Abstraction Generation for Generalized Planning with Pretrained Large Language Models
arXiv:2602.10485v1 Announce Type: new
Abstract: Qualitative Numerical Planning (QNP) serves as an important abstraction model for generalized planning (GP), which aims to compute general plans that solve multiple instances at once. Recent works show that large language models (LLMs) can function as...
MERIT Feedback Elicits Better Bargaining in LLM Negotiators
arXiv:2602.10467v1 Announce Type: new
Abstract: Bargaining is often regarded as a logical arena rather than an art or a matter of intuition, yet Large Language Models (LLMs) still struggle to navigate it due to limited strategic depth and difficulty adapting to complex human factors. Current benchm...
Found-RL: foundation model-enhanced reinforcement learning for autonomous driving
arXiv:2602.10458v1 Announce Type: new
Abstract: Reinforcement Learning (RL) has emerged as a dominant paradigm for end-to-end autonomous driving (AD). However, RL suffers from sample inefficiency and a lack of semantic interpretability in complex scenarios. Foundation Models, particularly Vision-La...
LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation
arXiv:2602.10367v1 Announce Type: new
Abstract: The deployment of Large Language Models (LLMs) in high-stakes clinical settings demands rigorous and reliable evaluation. However, existing medical benchmarks remain static, suffering from two critical limitations: (1) data contamination, where test s...
Discovering Differences in Strategic Behavior Between Humans and LLMs
arXiv:2602.10324v1 Announce Type: new
Abstract: As Large Language Models (LLMs) are increasingly deployed in social and strategic scenarios, it becomes critical to understand where and why their behavior diverges from that of humans. While behavioral game theory (BGT) provides a framework for analy...
Signature-Kernel Based Evaluation Metrics for Robust Probabilistic and Tail-Event Forecasting
arXiv:2602.10182v1 Announce Type: new
Abstract: Probabilistic forecasting is increasingly critical across high-stakes domains, from finance and epidemiology to climate science. However, current evaluation frameworks lack a consensus metric and suffer from two critical flaws: they often assume indep...
arXiv:2602.10177v1 Announce Type: new
Abstract: Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. The transition from competition-level problem-solving to professional research, however, requi...
Large Language Models Predict Functional Outcomes after Acute Ischemic Stroke
arXiv:2602.10119v1 Announce Type: new
Abstract: Accurate prediction of functional outcomes after acute ischemic stroke can inform clinical decision-making and resource allocation. Prior work on modified Rankin Scale (mRS) prediction has relied primarily on structured variables (e.g., age, NIHSS) an...
Trace Length is a Simple Uncertainty Signal in Reasoning Models
Abstract: Uncertainty quantification for LLMs is a key research direction towards addressing hallucination and other issues that limit their reliable deployment. In this work, we show that reasoning trace length is a simple and useful confidence estimator in large reasoning models. Through comprehensive exper...
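The signal this abstract describes is straightforward to test: score each answer by the negative length of its reasoning trace (shorter trace, higher confidence) and check whether correct answers outrank incorrect ones. A minimal sketch, with entirely hypothetical `(trace_length, correct)` data standing in for real model outputs:

```python
# Hedged sketch: reasoning-trace length as a confidence signal.
# The (trace_length_in_tokens, answer_correct) pairs below are invented
# for illustration, not from the paper.
samples = [
    (120, True), (480, False), (95, True), (300, True),
    (750, False), (210, True), (620, False), (150, True),
]

# Shorter traces are treated as higher confidence, so score = -length.
scores = [-length for length, _ in samples]
labels = [correct for _, correct in samples]

def auroc(scores, labels):
    """Probability a correct answer outranks an incorrect one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(f"trace-length AUROC: {auroc(scores, labels):.2f}")  # → 1.00 on this toy data
```

An AUROC of 0.5 would mean length carries no information; anything well above it supports the paper's claim that length alone is a usable estimator.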
Mapping the Design Space of User Experience for Computer Use Agents
Abstract: Large language model (LLM)-based computer use agents execute user commands by interacting with available UI elements, but little is known about how users want to interact with these agents or what design factors matter for their user experience (UX). We conducted a two-phase study to map the UX desi...
Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning
arXiv:2602.09066v1 Announce Type: new
Abstract: Large-scale multimodal contrastive learning has recently achieved impressive success in learning rich and transferable representations, yet it remains fundamentally limited by the uniform treatment of feature dimensions and the neglect of the intrinsi...
Enhanced Graph Transformer with Serialized Graph Tokens
arXiv:2602.09065v1 Announce Type: new
Abstract: Transformers have demonstrated success in graph learning, particularly for node-level tasks. However, existing methods encounter an information bottleneck when generating graph-level representations. The prevalent single token paradigm fails to fully ...
Learning to Remember, Learn, and Forget in Attention-Based Models
arXiv:2602.09075v1 Announce Type: new
Abstract: In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is ...
Patient foundation model for risk stratification in low-risk overweight patients
arXiv:2602.09079v1 Announce Type: new
Abstract: Accurate risk stratification in patients with overweight or obesity is critical for guiding preventive care and allocating high-cost therapies such as GLP-1 receptor agonists. We present PatientTPP, a neural temporal point process (TPP) model trained ...
Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models
arXiv:2602.09080v1 Announce Type: new
Abstract: Large Multimodal Models (LMMs) have achieved remarkable success in vision-language tasks, yet their vast parameter counts are often underutilized during both training and inference. In this work, we embrace the idea of looping back to move forward: re...
Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization
arXiv:2602.09121v1 Announce Type: new
Abstract: In this work, we present a lightweight and privacy-preserving Multimodal Emotion Recognition (MER) framework designed for deployment on edge devices. To demonstrate the framework's versatility, our implementation uses three modalities - speech, text and f...
PABU: Progress-Aware Belief Update for Efficient LLM Agents
arXiv:2602.09138v1 Announce Type: new
Abstract: Large Language Model (LLM) agents commonly condition actions on full action-observation histories, which introduce task-irrelevant information that easily leads to redundant actions and higher inference cost. We propose Progress-Aware Belief Update (P...
CoMMa: Contribution-Aware Medical Multi-Agents From A Game-Theoretic Perspective
arXiv:2602.09159v1 Announce Type: new
Abstract: Recent multi-agent frameworks have broadened the ability to tackle oncology decision support tasks that require reasoning over dynamic, heterogeneous patient data. We propose Contribution-Aware Medical Multi-Agents (CoMMa), a decentralized LLM-agent f...
A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation
arXiv:2602.09112v1 Announce Type: new
Abstract: What research can be pursued with small models trained to complete true programs? Typically, researchers study program synthesis via large language models (LLMs) which introduce issues such as knowing what is in or out of distribution, understanding f...
FlyAOC: Evaluating Agentic Ontology Curation of Drosophila Scientific Knowledge Bases
arXiv:2602.09163v1 Announce Type: new
Abstract: Scientific knowledge bases accelerate discovery by curating findings from primary literature into structured, queryable formats for both human researchers and emerging AI systems. Maintaining these resources requires expert curators to search relevant...
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?
arXiv:2602.07055v1 Announce Type: new
Abstract: Spatial embodied intelligence requires agents to act to acquire information under partial observability. While multimodal foundation models excel at passive perception, their capacity for active, self-directed exploration remains understudied. We prop...
Aster: Autonomous Scientific Discovery over 20x Faster Than Existing Methods
arXiv:2602.07040v1 Announce Type: new
Abstract: We introduce Aster, an AI agent for autonomous scientific discovery capable of operating over 20 times faster than existing frameworks. Given a task, an initial program, and a script to evaluate the performance of the program, Aster iteratively improv...
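The loop this abstract outlines — take a task, an initial program, and an evaluation script, then iteratively improve the program against that score — can be sketched generically. Everything below is illustrative (the `evaluate`, `propose`, and `improve` names and the toy objective are assumptions, not Aster's actual API):

```python
import random

# Hedged sketch of an evaluate-propose-accept improvement loop of the kind
# the abstract describes. In a real system, propose() would be an LLM edit
# and evaluate() the user-supplied evaluation script.

def evaluate(program):
    """Stand-in evaluation script: higher is better (max 0 at [3, 3, 3])."""
    return -sum((x - 3) ** 2 for x in program)

def propose(program, rng):
    """Stand-in for a proposed edit: perturb one parameter by +/-1."""
    candidate = list(program)
    i = rng.randrange(len(candidate))
    candidate[i] += rng.choice([-1, 1])
    return candidate

def improve(program, steps=200, seed=0):
    rng = random.Random(seed)
    best, best_score = program, evaluate(program)
    for _ in range(steps):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best, best_score

best, score = improve([0, 0, 0])
print(best, score)
```

The speedups Aster claims presumably come from how candidates are generated and evaluated (the truncated abstract does not say); the skeleton above only shows the control flow such an agent shares with simple hill climbing.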