AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• May 15, 2026

Rethinking Molecular OOD Generalization via Target-Aware Source Selection

arXiv:2605.13932v1 Announce Type: new Abstract: Robust prediction of molecular properties under extreme out-of-distribution (OOD) scenarios is a pivotal bottleneck in AI-driven drug discovery. Current scaffold-splitting protocols fail to obstruct microscopic semantic overlap, predisposing models to...

#ArXiv#Machine Learning#Academic

Tool• May 15, 2026

Unsupervised learning of acquisition variability in structural connectomes via hybrid latent space modeling

arXiv:2605.13933v1 Announce Type: new Abstract: Acquisition differences across sites, scanners, and protocols in dMRI introduce variability that complicates structural connectome analysis. This motivates deep learning models that can represent high-dimensional connectomes in a low-dimensional space...

#ArXiv#Machine Learning#Academic

Tool• May 15, 2026

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models

arXiv:2605.13935v1 Announce Type: new Abstract: Diffusion language models are a promising alternative to autoregressive models, yet post-training methods for them largely adapt reward-maximizing objectives. We identify a central failure mode in this setting we call trajectory locking: sampled rewar...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Macro-Action Based Multi-Agent Instruction Following through Value Cancellation

arXiv:2605.12655v1 Announce Type: new Abstract: Multi-agent reinforcement learning (MARL) in real-world use cases may need to adapt to external natural language instructions that interrupt ongoing behavior and conflict with long-horizon objectives. However, conditioning rewards on instructions intr...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

arXiv:2605.12673v1 Announce Type: new Abstract: Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in fro...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Revealing Interpretable Failure Modes of VLMs

arXiv:2605.12674v1 Announce Type: new Abstract: Vision-Language Models (VLMs) are increasingly used in safety-critical applications because of their broad reasoning capabilities and ability to generalize with minimal task-specific engineering. Despite these advantages, they can exhibit catastrophic...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Learning Transferable Latent User Preferences for Human-Aligned Decision Making

arXiv:2605.12682v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as reasoning modules in many applications. While they are efficient in certain tasks, LLMs often struggle to produce human-aligned solutions. Human-aligned decision making requires accounting for both...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Learning When to Act: Communication-Efficient Reinforcement Learning via Run-Time Assurance

arXiv:2605.12561v1 Announce Type: new Abstract: Safe reinforcement learning (RL) typically asks what an agent should do. We ask when it needs to act, and show that a single policy can jointly learn control inputs and communication-efficient timing decisions under a pointwise L...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

OceanCBM: A Concept Bottleneck Model for Mechanistic Interpretability in Ocean Forecasting

arXiv:2605.12639v1 Announce Type: new Abstract: Extreme ocean phenomena are challenging not only to predict but to diagnose, as accurate forecasts alone do not reveal the underlying physical drivers. While recent machine learning approaches achieve strong predictive skill, they remain largely opaqu...

#ArXiv#Machine Learning#Academic

Tool• May 14, 2026

Learning to Decide with AI Assistance under Human-Alignment

arXiv:2605.12646v1 Announce Type: new Abstract: It is widely agreed that when AI models assist decision-makers in high-stakes domains by predicting an outcome of interest, they should communicate the confidence of their predictions. However, empirical evidence suggests that decision-makers often st...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

mimalloc: A high-performance, scalable memory allocator for the modern era

mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free. It is relatively small (~12K lines), with clear internal data structures, and is easy to build and integrate into other projects. It provides bounded worst-case allocation times (up to OS...

#Microsoft#Research

Tool• May 13, 2026

GridSFM: A new, small foundation model for the electric grid

Introducing GridSFM, a small foundation model that can predict AC optimal power flow in milliseconds, boosting efficiency and unlocking cost savings. Learn how GridSFM gives grid operators direct visibility into congestion, stability, and system health.

#Microsoft#Research

Tool• May 13, 2026

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

arXiv:2605.10959v1 Announce Type: new Abstract: There is currently no unified metric for evaluating the efficiency of quantized neural networks. We propose QuIDE, built around the Intelligence Index I = (C x P)/log_2(T+1), which collapses the compression-accuracy-latency trade-off into a single sco...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models

arXiv:2605.10971v1 Announce Type: new Abstract: Discrete diffusion language models (DLMs) generate text by iteratively denoising all positions in parallel, offering an alternative to autoregressive models. Controlled generation methods for DLMs, imported from autoregressive models, apply uniform in...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

arXiv:2605.10974v1 Announce Type: new Abstract: Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving avoidable slack. We prove th...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

arXiv:2605.11136v1 Announce Type: new Abstract: We argue that multi-agent test-time evolution is not single-agent evolution replicated N times. A single-agent learner can only evolve its own context and memory. A multi-agent system additionally evolves who collaborates, how they collaborate, and ho...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents

arXiv:2605.11169v1 Announce Type: new Abstract: Large language model agents interleave reasoning, action selection, and observation to solve sequential decision-making tasks. In deployed settings where agents repeatedly handle related multi-step tasks, small action-selection errors can accumulate i...

#ArXiv#Machine Learning#Academic

Tool• May 13, 2026

The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes

arXiv:2605.11182v1 Announce Type: new Abstract: On-policy distillation (OPD) and on-policy self-distillation (OPSD) have emerged as promising post-training methods for large language models, offering dense token-level supervision on trajectories sampled from the model's own policy. However, existin...

#ArXiv#Machine Learning#Academic

Tool• May 12, 2026

Advancing AI for materials with MatterSim: experimental synthesis, faster simulation, and multi-task models

MatterSim is expanding what AI can do for materials science, from faster large-scale simulations to MatterSim-MT, a new multi-task model for simulating properties beyond potential energy surfaces alone.

#Microsoft#Research

Tool• May 12, 2026

8 ways self-evolving AI agents are about to change how we build software

A paper posted to arXiv this week describes an AI system that builds, improves, and deploys its own specialist agents. Here is what that means for engineers and technical teams.

#AI Accelerator Institute#AI#Research

Tool• May 12, 2026

Reinforcement learning for inverse structural design and rapid laser cutting of kirigami prototypes

arXiv:2605.08098v1 Announce Type: new Abstract: Kirigami is an increasingly useful fabrication method to produce shape-programmable metamaterial structures. However, inverse design remains difficult because deployment is nonlinear, and feasible cut layouts must satisfy discrete compatibility rules,...

#ArXiv#Machine Learning#Academic

Tool• May 12, 2026

Path-Based Gradient Boosting for Graph-Level Prediction

arXiv:2605.08102v1 Announce Type: new Abstract: We propose PathBoost, a gradient tree boosting method for graph-level classification and regression that learns discriminative path-based features directly from the input graph structure. Building on a previous work, which was tailored to a specific c...

#ArXiv#Machine Learning#Academic

Tool• May 12, 2026

Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning

arXiv:2605.08109v1 Announce Type: new Abstract: Inertial microfluidic devices (IMDs) offer low-cost, high-throughput alternative techniques for many traditional particle- (or cell-) manipulation tasks, but simulating them requires being able to predict particle migration, and thus particle lift for...

#ArXiv#Machine Learning#Academic

Tool• May 12, 2026

BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models

arXiv:2605.08110v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has become the standard for fine-tuning large pre-trained models at reduced computational cost. However, its low-rank point-estimate updates limit expressiveness, leave a persistent gap relative to full fine-tuning accuracy,...

#ArXiv#Machine Learning#Academic
