AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool • Dec 9, 2026

Transparency in AI is on the Decline

A new study shows the AI industry is withholding key information.

#Stanford #HAI #Ethics

Tool • Mar 12, 2026

Meta buys Moltbook: The social network where AI agents talk to each other

Meta’s acquisition of Moltbook highlights a growing focus on agent-to-agent systems and the infrastructure required to support them. It’s a small deal that signals bigger shifts in how AI ecosystems may evolve.

#AI Accelerator Institute #AI #Research

Tool • Mar 12, 2026

Systematic debugging for AI agents: Introducing the AgentRx framework

As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transparency. When a human makes a mistake, we can usually trace the logic. But when an AI a...

#Microsoft #Research

Tool • Mar 12, 2026

Verbalizing LLM's Higher-order Uncertainty via Imprecise Probabilities

arXiv:2603.10396v1 Announce Type: new Abstract: Despite the growing demand for eliciting uncertainty from large language models (LLMs), empirical evidence suggests that LLM behavior is not always adequately captured by the elicitation techniques developed under the classical probabilistic uncertain...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability

arXiv:2603.10384v1 Announce Type: new Abstract: Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We introduce TRACED, a framework that assesses reasoning quality through theoretically grounded geometric kinematics. By decomposing reaso...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

HEAL: Hindsight Entropy-Assisted Learning for Reasoning Distillation

arXiv:2603.10359v1 Announce Type: new Abstract: Distilling reasoning capabilities from Large Reasoning Models (LRMs) into smaller models is typically constrained by the limitation of rejection sampling. Standard methods treat the teacher as a static filter, discarding complex "corner-case" problems...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

Hybrid Self-evolving Structured Memory for GUI Agents

arXiv:2603.10291v1 Announce Type: new Abstract: The remarkable progress of vision-language models (VLMs) has enabled GUI agents to interact with computers in a human-like manner. Yet real-world computer-use tasks remain difficult due to long-horizon workflows, diverse interfaces, and frequent inter...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

Agentic Control Center for Data Product Optimization

arXiv:2603.10133v1 Announce Type: new Abstract: Data products enable end users to gain greater insights about their data by providing supporting assets, such as example question-SQL pairs which can be answered using the data or views over the database tables. However, producing useful data products...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

LWM-Temporal: Sparse Spatio-Temporal Attention for Wireless Channel Representation Learning

arXiv:2603.10024v1 Announce Type: new Abstract: LWM-Temporal is a new member of the Large Wireless Models (LWM) family that targets the spatiotemporal nature of wireless channels. Designed as a task-agnostic foundation model, LWM-Temporal learns universal channel embeddings that capture mobility-in...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

Personalized Group Relative Policy Optimization for Heterogenous Preference Alignment

arXiv:2603.10009v1 Announce Type: new Abstract: Despite their sophisticated general-purpose capabilities, Large Language Models (LLMs) often fail to align with diverse individual preferences because standard post-training methods, like Reinforcement Learning with Human Feedback (RLHF), optimize for...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

MoE-SpAc: Efficient MoE Inference Based on Speculative Activation Utility in Heterogeneous Edge Scenarios

arXiv:2603.09983v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models enable scalable performance but face severe memory constraints on edge devices. Existing offloading strategies struggle with I/O bottlenecks due to the dynamic, low-information nature of autoregressive expert activation...

#ArXiv #Machine Learning #Academic

Tool • Mar 12, 2026

Explainable LLM Unlearning Through Reasoning

arXiv:2603.09980v1 Announce Type: new Abstract: LLM unlearning is essential for mitigating safety, copyright, and privacy concerns in pre-trained large language models (LLMs). Compared to preference alignment, it offers a more explicit way by removing undesirable knowledge characterized by specific...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

RAG shows its work. That’s not the same as being right.

How GenAI turns first-party data into revenue with LLM tagging, RAG traceability, and governance that protects trust.

#AI Accelerator Institute #AI #Research

Tool • Mar 11, 2026

The Google tool helping small AI models outperform the giants

What if the secret to better AI isn’t bigger models, but better tools? Researchers at Google DeepMind have shown that smaller language models can outperform larger ones when they’re given the ability to write their own code.

#AI Accelerator Institute #AI #Research

Tool • Mar 11, 2026

SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

arXiv:2603.08763v1 Announce Type: new Abstract: A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie tas...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

arXiv:2603.08758v1 Announce Type: new Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively ...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

Hindsight Credit Assignment for Long-Horizon LLM Agents

arXiv:2603.08754v1 Announce Type: new Abstract: Large Language Model (LLM) agents often face significant credit assignment challenges in long-horizon, multi-step tasks due to sparse rewards. Existing value-free methods, such as Group Relative Policy Optimization (GRPO), encounter two fundamental bo...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

Equitable Multi-Task Learning for AI-RANs

arXiv:2603.08717v1 Announce Type: new Abstract: AI-enabled Radio Access Networks (AI-RANs) are expected to serve heterogeneous users with time-varying learning tasks over shared edge resources. Ensuring equitable inference performance across these users requires adaptive and fair learning mechanism...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem

arXiv:2603.08938v1 Announce Type: new Abstract: The rapid emergence of open-source, locally hosted intelligent agents marks a critical inflection point in human-computer interaction. Systems such as OpenClaw demonstrate that Large Language Model (LLM)-based agents can autonomously operate local com...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

Interpretable Markov-Based Spatiotemporal Risk Surfaces for Missing-Child Search Planning with Reinforcement Learning and LLM-Based Quality Assurance

arXiv:2603.08933v1 Announce Type: new Abstract: The first 72 hours of a missing-child investigation are critical for successful recovery. However, law enforcement agencies often face fragmented, unstructured data and a lack of dynamic, geospatial predictive tools. Our system, Guardian, provides an ...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search

arXiv:2603.08877v1 Announce Type: new Abstract: Agentic Retrieval-Augmented Generation (RAG) systems combine iterative search, planning prompts, and retrieval backends, but deployed settings impose explicit budgets on tool calls and completion tokens. We present a controlled measurement study of ho...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

LDP: An Identity-Aware Protocol for Multi-Agent LLM Systems

arXiv:2603.08852v1 Announce Type: new Abstract: As multi-agent AI systems grow in complexity, the protocols connecting them constrain their capabilities. Current protocols such as A2A and MCP do not expose model-level properties as first-class primitives, ignoring properties fundamental to effectiv...

#ArXiv #Machine Learning #Academic

Tool • Mar 11, 2026

MASEval: Extending Multi-Agent Evaluation from Models to Systems

arXiv:2603.08835v1 Announce Type: new Abstract: The rapid adoption of LLM-based agentic systems has produced a rich ecosystem of frameworks (smolagents, LangGraph, AutoGen, CAMEL, LlamaIndex, i.a.). Yet existing benchmarks are model-centric: they fix the agentic setup and do not compare other syste...

#ArXiv #Machine Learning #Academic

Tool • Mar 10, 2026

From raw interaction to reusable knowledge: Rethinking memory for AI agents

It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memory means that agents must search through larger volumes of past interactions to find...

#Microsoft #Research
