The Google tool helping small AI models outperform the giants
What if the secret to better AI isn’t bigger models, but better tools?
Researchers at Google DeepMind have shown that smaller language models can outperform larger ones when they’re given the ability to write their own code.
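Letting a model answer by writing and executing code is the core of this approach. The sketch below is a deliberately bare-bones illustration of that loop, not DeepMind's system: model-emitted code runs in a restricted namespace and the value it binds to `result` is returned (real deployments use proper sandboxing).

```python
def run_model_code(code_str):
    """Execute model-emitted code in a restricted namespace and
    return the value it binds to `result` -- a bare-bones version
    of the code-as-tool-use loop (real systems sandbox properly)."""
    ns = {"__builtins__": {"range": range, "len": len, "sum": sum}}
    exec(code_str, ns)
    return ns.get("result")

# e.g. a small model answers an arithmetic question by emitting code
# instead of guessing the number in natural language
out = run_model_code("result = sum(i * i for i in range(10))")
```

The point is that a small model only has to produce a correct short program, while the interpreter does the exact computation a large model would otherwise have to approximate in its weights.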
MASEval: Extending Multi-Agent Evaluation from Models to Systems
arXiv:2603.08835v1 Announce Type: new
Abstract: The rapid adoption of LLM-based agentic systems has produced a rich ecosystem of frameworks (smolagents, LangGraph, AutoGen, CAMEL, LlamaIndex, among others). Yet existing benchmarks are model-centric: they fix the agentic setup and do not compare other syste...
LDP: An Identity-Aware Protocol for Multi-Agent LLM Systems
arXiv:2603.08852v1 Announce Type: new
Abstract: As multi-agent AI systems grow in complexity, the protocols connecting them constrain their capabilities. Current protocols such as A2A and MCP do not expose model-level properties as first-class primitives, ignoring properties fundamental to effectiv...
Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search
arXiv:2603.08877v1 Announce Type: new
Abstract: Agentic Retrieval-Augmented Generation (RAG) systems combine iterative search, planning prompts, and retrieval backends, but deployed settings impose explicit budgets on tool calls and completion tokens. We present a controlled measurement study of ho...
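The explicit budgets the abstract mentions (caps on tool calls and completion tokens) are straightforward to enforce with a small accounting wrapper. A generic sketch, not the paper's implementation, with all names hypothetical:

```python
class SearchBudget:
    """Tracks explicit per-episode budgets on tool calls and
    completion tokens, as in budget-constrained agentic search."""

    def __init__(self, max_tool_calls, max_tokens):
        self.max_tool_calls = max_tool_calls
        self.max_tokens = max_tokens
        self.tool_calls = 0
        self.tokens = 0

    def can_call_tool(self):
        return self.tool_calls < self.max_tool_calls

    def charge_tool_call(self, tokens_used):
        # Each retrieval or search step costs one call plus tokens.
        if not self.can_call_tool():
            raise RuntimeError("tool-call budget exhausted")
        self.tool_calls += 1
        self.tokens += tokens_used

    def remaining_tokens(self):
        return max(0, self.max_tokens - self.tokens)

budget = SearchBudget(max_tool_calls=3, max_tokens=1000)
budget.charge_tool_call(tokens_used=400)
```

Design decisions like how many iterative search rounds to allow then become a question of how to spend `remaining_tokens()` before the caps bind.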
Interpretable Markov-Based Spatiotemporal Risk Surfaces for Missing-Child Search Planning with Reinforcement Learning and LLM-Based Quality Assurance
arXiv:2603.08933v1 Announce Type: new
Abstract: The first 72 hours of a missing-child investigation are critical for successful recovery. However, law enforcement agencies often face fragmented, unstructured data and a lack of dynamic, geospatial predictive tools. Our system, Guardian, provides an ...
AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem
arXiv:2603.08938v1 Announce Type: new
Abstract: The rapid emergence of open-source, locally hosted intelligent agents marks a critical inflection point in human-computer interaction. Systems such as OpenClaw demonstrate that Large Language Model (LLM)-based agents can autonomously operate local com...
arXiv:2603.08717v1 Announce Type: new
Abstract: AI-enabled Radio Access Networks (AI-RANs) are expected to serve heterogeneous users with time-varying learning tasks over shared edge resources. Ensuring equitable inference performance across these users requires adaptive and fair learning mechanism...
Hindsight Credit Assignment for Long-Horizon LLM Agents
arXiv:2603.08754v1 Announce Type: new
Abstract: Large Language Model (LLM) agents often face significant credit assignment challenges in long-horizon, multi-step tasks due to sparse rewards. Existing value-free methods, such as Group Relative Policy Optimization (GRPO), encounter two fundamental bo...
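GRPO, the value-free baseline the abstract names, estimates advantages by standardizing each sampled completion's reward against the group drawn for the same prompt. A minimal sketch of that advantage computation (the sparse-reward example values are illustrative):

```python
def grpo_advantages(rewards, eps=1e-8):
    """Group Relative Policy Optimization advantage estimate:
    each sampled completion's reward is standardized against
    the group of completions drawn for the same prompt."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Sparse long-horizon reward: most rollouts score 0, one succeeds.
advs = grpo_advantages([0.0, 0.0, 0.0, 1.0])
```

With sparse rewards the single successful rollout gets a large positive advantage and every failed rollout the same small negative one, which is exactly the coarse, trajectory-level credit assignment the abstract calls a bottleneck for long-horizon agents.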
Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields
arXiv:2603.08758v1 Announce Type: new
Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively ...
SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning
arXiv:2603.08763v1 Announce Type: new
Abstract: A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie tas...
From raw interaction to reusable knowledge: Rethinking memory for AI agents
It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memory means that agents must search through larger volumes of past interactions to find...
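The search problem described here, finding the relevant items in a growing interaction log, is often handled with relevance-scored retrieval. A toy sketch using term overlap (real systems typically use embeddings; all data below is illustrative):

```python
def retrieve(query, memory, k=2):
    """Score stored interaction snippets by term overlap with the
    query and return the top-k -- the kind of relevance filtering
    that keeps a growing agent memory usable."""
    q = set(query.lower().split())
    scored = sorted(
        memory,
        key=lambda m: len(q & set(m.lower().split())),
        reverse=True,
    )
    return scored[:k]

memory = [
    "user prefers metric units",
    "debugged the payment webhook timeout",
    "weather was discussed on tuesday",
]
hits = retrieve("why did the payment webhook fail", memory)
```

The larger the raw log grows, the more the irrelevant entries dilute such scores, which is why the article argues for distilling raw interactions into reusable knowledge rather than retaining everything.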
vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
arXiv:2603.06588v1 Announce Type: new
Abstract: Modern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for transformer-based large language models (LLMs). The vLLM project is a major open-source library to su...
How Attention Sinks Emerge in Large Language Models: An Interpretability Perspective
arXiv:2603.06591v1 Announce Type: new
Abstract: Large Language Models (LLMs) often allocate disproportionate attention to specific tokens, a phenomenon commonly referred to as the attention sink. While such sinks are generally considered detrimental, prior studies have identified a notable exceptio...
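The "disproportionate attention to specific tokens" described here is usually quantified as the fraction of attention mass that query positions assign to the sink position (typically token 0). A toy sketch with a hand-built attention matrix (all numbers illustrative, not from the paper):

```python
def sink_mass(attn_rows):
    """Fraction of attention mass allocated to token 0, averaged
    over query positions. attn_rows[i] is the softmax-normalized
    attention distribution of query i over keys 0..i."""
    return sum(row[0] for row in attn_rows) / len(attn_rows)

# Illustrative causal attention rows for a 4-token sequence: each
# query sends most of its mass to the initial token (the "sink").
rows = [
    [1.0],
    [0.8, 0.2],
    [0.7, 0.2, 0.1],
    [0.75, 0.1, 0.1, 0.05],
]
score = sink_mass(rows)  # close to 1.0 indicates a strong sink
```

Tracking this statistic across layers and training checkpoints is the standard way to study when and where sinks emerge.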
FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures
arXiv:2603.06600v1 Announce Type: new
Abstract: Vision Language Models (VLMs) are prone to errors, and identifying where these errors occur is critical for ensuring the reliability and safety of AI systems. In this paper, we propose an approach that automatically generates questions designed to del...
arXiv:2603.06601v1 Announce Type: new
Abstract: Deep neural networks, and more recently large-scale generative models such as large language models (LLMs) and large vision-action models (LVAs), achieve remarkable performance across diverse domains, yet their prohibitive computational cost hinders d...
Autonomous AI Agents for Option Hedging: Enhancing Financial Stability through Shortfall Aware Reinforcement Learning
arXiv:2603.06587v1 Announce Type: new
Abstract: The deployment of autonomous AI agents in derivatives markets has widened a practical gap between static model calibration and realized hedging outcomes. We introduce two reinforcement learning frameworks, a novel Replication Learning of Option Pricin...
Scaling Strategy, Not Compute: A Stand-Alone, Open-Source StarCraft II Benchmark for Accessible Reinforcement Learning Research
arXiv:2603.06608v1 Announce Type: new
Abstract: The research community lacks a middle ground between StarCraft II's full game and its mini-games. The full game's sprawling state-action space renders reward signals sparse and noisy, but in mini-games simple agents saturate performance. This complexity...
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines
arXiv:2603.06679v1 Announce Type: new
Abstract: Video world models have shown immense promise for interactive simulation and entertainment, but current systems still struggle with two important aspects of interactivity: user control over the environment for reproducible, editable experiences, and s...
Breaking the Martingale Curse: Multi-Agent Debate via Asymmetric Cognitive Potential Energy
arXiv:2603.06801v1 Announce Type: new
Abstract: Multi-Agent Debate (MAD) has emerged as a promising paradigm for enhancing large language model reasoning. However, recent work reveals a limitation: standard MAD cannot improve belief correctness beyond majority voting; we refer to this as the Marting...
How AstraZeneca is quietly rewiring Boston’s AI ecosystem
For AI professionals tired of hype decks and stalled pilots, AstraZeneca’s Boston strategy offers a practical blueprint for making AI work in complex, regulated environments.
Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents
arXiv:2603.05517v1 Announce Type: new
Abstract: Autonomous LLM agents fail because long-horizon policy remains implicit in model weights and transcripts, while safety is retrofitted post hoc. We propose Traversal-as-Policy: distill sandboxed OpenHands execution logs into a single executable Gated B...
JAWS: Enhancing Long-term Rollout of Neural Operators via Spatially-Adaptive Jacobian Regularization
arXiv:2603.05538v1 Announce Type: new
Abstract: Data-driven surrogate models improve the efficiency of simulating continuous dynamical systems, yet their autoregressive rollouts are often limited by instability and spectral blow-up. While global regularization techniques can enforce contractive dyn...
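A Jacobian penalty of the kind the abstract sketches discourages the surrogate from amplifying perturbations step to step, which is what drives rollout blow-up. A toy finite-difference version for a scalar one-step map (an illustration of the general idea, not the JAWS method, which is spatially adaptive):

```python
def jacobian_penalty(f, x, eps=1e-4):
    """Finite-difference estimate of |f'(x)|^2, usable as a
    regularizer: values below 1 mean the one-step map contracts
    perturbations at x, so autoregressive rollouts stay stable."""
    d = (f(x + eps) - f(x - eps)) / (2 * eps)
    return d * d

# Toy one-step surrogate: a contractive linear map, |f'| = 0.5.
f = lambda x: 0.5 * x + 0.1
penalty = jacobian_penalty(f, 2.0)
```

In training, this term is added to the data-fitting loss so the learned dynamics stay contractive along rollout trajectories.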
arXiv:2603.05539v1 Announce Type: new
Abstract: We introduce VDCook: a self-evolving video data operating system, a configurable video data construction platform for researchers and vertical domain teams. Users initiate data requests via natural language queries and adjustable parameters (scale, re...