AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool • May 12, 2026

Spatial Priming Outperforms Semantic Prompting: A Grid-Based Approach to Improving LLM Accuracy on Chart Data Extraction

arXiv:2605.08220v1. Abstract: The automated extraction of data from scientific charts is a critical task for large-scale literature analysis. While multimodal Large Language Models (LLMs) show promise, their accuracy on non-standardized charts remains a challenge. This raises a ke...

#ArXiv #Machine Learning #Academic

Tool • May 12, 2026

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

arXiv:2605.08354v1. Abstract: Aligning multimodal generative models with human preferences demands reward signals that respect the compositional, multi-dimensional structure of human judgment. Prevailing RLHF approaches reduce this structure to scalar or pairwise labels, collapsin...

#ArXiv #Machine Learning #Academic

Tool • May 12, 2026

On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective

arXiv:2605.08368v1. Abstract: Debates about large language model post-training often treat supervised fine-tuning (SFT) as imitation and reinforcement learning (RL) as discovery. But this distinction is too coarse. What matters is whether a training procedure increases the probabi...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

RateQuant: Optimal Mixed-Precision KV Cache Quantization via Rate-Distortion Theory

arXiv:2605.06675v1. Abstract: Large language models cache all previously computed key-value (KV) pairs during generation, and this KV cache grows linearly with sequence length, making it a primary memory bottleneck for serving. Quantizing the KV cache to fewer bits reduces this co...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

LKV: End-to-End Learning of Head-wise Budgets and Token Selection for LLM KV Cache Eviction

arXiv:2605.06676v1. Abstract: Long-context inference in Large Language Models (LLMs) is bottlenecked by the linear growth of Key-Value (KV) cache memory. Existing KV cache compression paradigms are fundamentally limited by heuristics: heuristic budgeting relies on statistical prio...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

arXiv:2605.06678v1. Abstract: According to the United Nations Office for Disaster Risk Reduction (2025), the average annual cost of natural catastrophes increased from 70–80 billion USD between 1970 and 2000 to 180–200 billion USD between 2001 and 2020. Reports from organization...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding

arXiv:2605.06679v1. Abstract: Vision-Language Models (VLMs) are frequently undermined by object hallucination, generating content that contradicts visual reality, due to an over-reliance on linguistic priors. We introduce Positive-and-Negative Decoding (PND), a training-free infer...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

GraphDC: A Divide-and-Conquer Multi-Agent System for Scalable Graph Algorithm Reasoning

arXiv:2605.06671v1. Abstract: Large Language Models (LLMs) have demonstrated strong potential for many mathematical problems. However, their performance on graph algorithmic tasks is still unsatisfying, since graphs are naturally more complex in topology and often require systemat...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

More Thinking, More Bias: Length-Driven Position Bias in Reasoning Models

arXiv:2605.06672v1. Abstract: Chain-of-thought (CoT) reasoning and reasoning-tuned models such as DeepSeek-R1 are commonly assumed to reduce shallow heuristic biases by thinking carefully. We test this on position bias in multiple-choice QA and find a different story: within any r...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

Fast and Effective Redistricting Optimization via Composite-Move Tabu Search

arXiv:2605.06682v1. Abstract: Spatial redistricting is a practical combinatorial optimization problem that demands high-quality solutions, rapid turnaround, and flexibility to accommodate multi-criteria objectives and interactive refinement. A central challenge is the contiguity c...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

State Representation and Termination for Recursive Reasoning Systems

arXiv:2605.06690v1. Abstract: Recursive reasoning systems alternate between acquiring new evidence and refining an accumulated understanding. Two design choices are typically left implicit: how to represent the evolving reasoning state, and when to stop iterating. This paper addre...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

Hidden Coalitions in Multi-Agent AI: A Spectral Diagnostic from Internal Representations

arXiv:2605.06696v1. Abstract: Collections of interacting AI agents can form coalitions, creating emergent group-level organization that is critical for AI safety and alignment. However, observing agent behavior alone is often insufficient to distinguish genuine informational coupl...

#ArXiv #Machine Learning #Academic

Tool • May 11, 2026

BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning

Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinf...

#Apple #On-device AI

Tool • May 9, 2026

Understanding Annotator Safety Policy with Interpretability

arXiv:2605.05329v1. Abstract: Safety policies define what constitutes safe and unsafe AI outputs, guiding data annotation and model development. However, annotation disagreement is pervasive and can stem from multiple sources such as operational failures (annotators misunderstand ...

#ArXiv #Machine Learning #Academic

Tool • May 9, 2026

Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems

arXiv:2605.05379v1. Abstract: Enterprise agents increasingly operate inside scoped retrieval systems, delegated workflows, and policy-constrained evidence environments. In these settings, access control can be enforced correctly while the system still produces an answer that appea...

#ArXiv #Machine Learning #Academic

Tool • May 9, 2026

BALAR: A Bayesian Agentic Loop for Active Reasoning

arXiv:2605.05386v1. Abstract: Large language models increasingly operate in interactive settings where solving a task requires multiple rounds of information exchange with a user. However, most current systems treat dialogue reactively and lack a principled mechanism to reason abo...

#ArXiv #Machine Learning #Academic

Tool • May 9, 2026

Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections

arXiv:2605.05402v1. Abstract: Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study introduces an AI-enabled analytics framework leveraging existing CCTV infrastructure to evaluate the impact of soft interventions, such as tem...

#ArXiv #Machine Learning #Academic

Tool • May 8, 2026

Building realistic electric transmission grid dataset at scale: a pipeline from open dataset

Microsoft Research is excited to release an open dataset of approximate transmission topology of the U.S. power grid derived from publicly available data. The ability to study transmission-level power grid behavior is essential for modern power systems research. Analyses of congestion, transmission ...

#Microsoft #Research

Tool • May 8, 2026

Becoming AI ready: Building a company with 12 AI agents as my first hires

I left Google ten days ago to found my own company. It's been quite a journey figuring out how things work outside of the mothership, and I'm genuinely excited to share what I've learned from both sides of the house...

#AI Accelerator Institute #AI #Research

Tool • May 8, 2026

7 signs your AI agent system needs to start building its own tools

Most AI agents are stuck in their ways. Built once, they repeat the same patterns regardless of the task at hand. But new research suggests a smarter path forward: agents that get sharper with every challenge they face...

#AI Accelerator Institute #AI #Research

Tool • May 8, 2026

Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling

Overview of adaptive parallel reasoning. What if a reasoning model could decide for itself when to decompose and parallelize independent subtasks, how many concurrent threads to spawn, and how to coordinate them based on the problem at hand? We provide a detailed analysis of recent progress in the...

#Berkeley #BAIR #Academic

Tool • May 8, 2026

Nationwide EHR-Based Chronic Rhinosinusitis Prediction Using Demographic-Stratified Models

arXiv:2605.05213v1. Abstract: Chronic rhinosinusitis (CRS) is a common heterogeneous inflammatory disorder that causes substantial morbidity and healthcare costs. CRS is difficult to identify early from routine encounters, as symptom presentations overlap with common conditions su...

#ArXiv #Machine Learning #Academic

Tool • May 8, 2026

SAT: Sequential Agent Tuning for Coordinator Free Plug and Play Multi-LLM Training with Monotonic Improvement Guarantees

arXiv:2605.05216v1. Abstract: Large language models (LLMs) with a large number of parameters achieve strong performance but are often prohibitively expensive to deploy. Recent work explores using teams of smaller, more efficient LLMs that collectively match or even outperform a si...

#ArXiv #Machine Learning #Academic

Tool • May 8, 2026

Physics-Informed Neural Networks with Learnable Loss Balancing and Transfer Learning

arXiv:2605.05217v1. Abstract: We propose a self-supervised physics-informed neural network (PINN) framework that adaptively balances physics-based and data-driven supervision for scientific machine learning under data scarcity. Unlike prior PINNs that rely on fixed or heuristic we...

#ArXiv #Machine Learning #Academic
