Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders
arXiv:2605.13930v1 Announce Type: new
Abstract: EEG foundation models achieve state-of-the-art clinical performance, yet the internal computations driving their predictions remain opaque: a barrier to clinical trust. We apply TopK Sparse Autoencoders (SAEs) across three architecturally distinct EEG...
Learning to Decide with AI Assistance under Human-Alignment
arXiv:2605.12646v1 Announce Type: new
Abstract: It is widely agreed that when AI models assist decision-makers in high-stakes domains by predicting an outcome of interest, they should communicate the confidence of their predictions. However, empirical evidence suggests that decision-makers often st...
OceanCBM: A Concept Bottleneck Model for Mechanistic Interpretability in Ocean Forecasting
arXiv:2605.12639v1 Announce Type: new
Abstract: Extreme ocean phenomena are challenging not only to predict but to diagnose, as accurate forecasts alone do not reveal the underlying physical drivers. While recent machine learning approaches achieve strong predictive skill, they remain largely opaqu...
Learning When to Act: Communication-Efficient Reinforcement Learning via Run-Time Assurance
arXiv:2605.12561v1 Announce Type: new
Abstract: Safe reinforcement learning (RL) typically asks $\textit{what}$ an agent should do. We ask $\textit{when}$ it needs to act, and show that a single policy can jointly learn control inputs and communication-efficient timing decisions under a pointwise L...
Learning Transferable Latent User Preferences for Human-Aligned Decision Making
arXiv:2605.12682v1 Announce Type: new
Abstract: Large language models (LLMs) are increasingly used as reasoning modules in many applications. While they are efficient in certain tasks, LLMs often struggle to produce human-aligned solutions. Human-aligned decision making requires accounting for both...
arXiv:2605.12674v1 Announce Type: new
Abstract: Vision-Language Models (VLMs) are increasingly used in safety-critical applications because of their broad reasoning capabilities and ability to generalize with minimal task-specific engineering. Despite these advantages, they can exhibit catastrophic...
Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack
arXiv:2605.12673v1 Announce Type: new
Abstract: Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in fro...
Macro-Action Based Multi-Agent Instruction Following through Value Cancellation
arXiv:2605.12655v1 Announce Type: new
Abstract: Multi-agent reinforcement learning (MARL) in real-world use cases may need to adapt to external natural language instructions that interrupt ongoing behavior and conflict with long-horizon objectives. However, conditioning rewards on instructions intr...
mimalloc: A high-performance, scalable memory allocator for the modern era
mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free. It is relatively small (~12K lines), with clear internal data structures, and is easy to build and integrate into other projects. It provides bounded worst-case allocation times (up to OS...
GridSFM: A new, small foundation model for the electric grid
Introducing GridSFM, a small foundation model that can predict AC optimal power flow in milliseconds, boosting efficiency and unlocking cost savings. Learn how GridSFM gives grid operators direct visibility into congestion, stability, and system health.
The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes
arXiv:2605.11182v1 Announce Type: new
Abstract: On-policy distillation (OPD) and on-policy self-distillation (OPSD) have emerged as promising post-training methods for large language models, offering dense token-level supervision on trajectories sampled from the model's own policy. However, existin...
OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents
arXiv:2605.11169v1 Announce Type: new
Abstract: Large language model agents interleave reasoning, action selection, and observation to solve sequential decision-making tasks. In deployed settings where agents repeatedly handle related multi-step tasks, small action-selection errors can accumulate i...
EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales
arXiv:2605.11136v1 Announce Type: new
Abstract: We argue that multi-agent test-time evolution is not single-agent evolution replicated N times. A single-agent learner can only evolve its own context and memory. A multi-agent system additionally evolves who collaborates, how they collaborate, and ho...
Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization
arXiv:2605.10974v1 Announce Type: new
Abstract: Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving avoidable slack. We prove th...
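To make "bounding the softmax function over interval constraints" concrete, here is a minimal sketch of the simplest elementwise bound. It relies only on the fact that each softmax output increases in its own score and decreases in every other score. This is a baseline of the objective-independent kind, not the paper's Vertex-Softmax method; the function name `softmax_interval_bounds` is hypothetical.

```python
import math


def softmax_interval_bounds(lower, upper):
    """Elementwise bounds on softmax(x) for lower[i] <= x[i] <= upper[i].

    Uses softmax_i(x) = 1 / (1 + sum_{j != i} exp(x_j - x_i)).
    Illustrative only: there is no guard against overflow in math.exp
    for very wide intervals.
    """
    n = len(lower)
    lo_bounds, hi_bounds = [], []
    for i in range(n):
        # Smallest output i: own score at its lower end, all others at their upper end.
        worst = sum(math.exp(upper[j] - lower[i]) for j in range(n) if j != i)
        lo_bounds.append(1.0 / (1.0 + worst))
        # Largest output i: own score at its upper end, all others at their lower end.
        best = sum(math.exp(lower[j] - upper[i]) for j in range(n) if j != i)
        hi_bounds.append(1.0 / (1.0 + best))
    return lo_bounds, hi_bounds


def softmax(x):
    """Plain softmax, used to check that a point inside the box respects the bounds."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-softmax score intervals for three attention positions.
lo, hi = softmax_interval_bounds([0.0, -1.0, 0.5], [1.0, 0.0, 0.5])
```

Each output is bounded separately, so the lower bounds need not sum to 1 and neither do the upper bounds. That looseness is the sort of slack the abstract attributes to relaxations made without reference to the downstream objective.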
Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models
arXiv:2605.10971v1 Announce Type: new
Abstract: Discrete diffusion language models (DLMs) generate text by iteratively denoising all positions in parallel, offering an alternative to autoregressive models. Controlled generation methods for DLMs, imported from autoregressive models, apply uniform in...
QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization
arXiv:2605.10959v1 Announce Type: new
Abstract: There is currently no unified metric for evaluating the efficiency of quantized neural networks. We propose QuIDE, built around the Intelligence Index $I = (C \times P)/\log_2(T+1)$, which collapses the compression-accuracy-latency trade-off into a single sco...
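As a rough illustration, the Intelligence Index formula quoted in the abstract can be evaluated as follows. The truncated abstract does not define the symbols, so reading C as a compression ratio, P as task accuracy, and T as latency is an assumption here, as are the units and the example numbers; the function name `intelligence_index` is hypothetical.

```python
import math


def intelligence_index(c: float, p: float, t: float) -> float:
    """Evaluate I = (C * P) / log2(T + 1).

    Assumed meanings (not confirmed by the abstract):
    c = compression ratio, p = accuracy, t = latency.
    """
    if t <= 0:
        # log2(t + 1) must be positive for the ratio to be defined.
        raise ValueError("t must be positive")
    return (c * p) / math.log2(t + 1)


# Hypothetical comparison of an uncompressed and a 4x-compressed model.
baseline = intelligence_index(c=1.0, p=0.80, t=3.0)   # log2(4) = 2
quantized = intelligence_index(c=4.0, p=0.75, t=1.0)  # log2(2) = 1
```

With these made-up inputs the quantized model scores higher because the gain in C and the drop in T outweigh the small loss in P. The logarithm in the denominator means large latency differences move the score less than proportional changes in C or P.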
Advancing AI for materials with MatterSim: experimental synthesis, faster simulation, and multi-task models
MatterSim is expanding what AI can do for materials science—from faster large-scale simulations to MatterSim-MT, a new multi-task model for simulating properties beyond potential energy surfaces alone.
8 ways self-evolving AI agents are about to change how we build software
A new paper out of arXiv this week describes an AI system that builds, improves, and deploys its own specialist agents. Here is what that actually means for engineers and technical teams.
On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective
arXiv:2605.08368v1 Announce Type: new
Abstract: Debates about large language model post-training often treat supervised fine-tuning (SFT) as imitation and reinforcement learning (RL) as discovery. But this distinction is too coarse. What matters is whether a training procedure increases the probabi...
Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria
arXiv:2605.08354v1 Announce Type: new
Abstract: Aligning multimodal generative models with human preferences demands reward signals that respect the compositional, multi-dimensional structure of human judgment. Prevailing RLHF approaches reduce this structure to scalar or pairwise labels, collapsin...
Spatial Priming Outperforms Semantic Prompting: A Grid-Based Approach to Improving LLM Accuracy on Chart Data Extraction
arXiv:2605.08220v1 Announce Type: new
Abstract: The automated extraction of data from scientific charts is a critical task for large-scale literature analysis. While multimodal Large Language Models (LLMs) show promise, their accuracy on non-standardized charts remains a challenge. This raises a ke...
BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models
arXiv:2605.08110v1 Announce Type: new
Abstract: Low-Rank Adaptation (LoRA) has become the standard for fine-tuning large pre-trained models at reduced computational cost. However, its low-rank point-estimate updates limit expressiveness, leave a persistent gap relative to full fine-tuning accuracy,...
Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning
arXiv:2605.08109v1 Announce Type: new
Abstract: Inertial microfluidic devices (IMDs) offer low-cost, high-throughput alternative techniques for many traditional particle- (or cell-) manipulation tasks, but simulating them requires being able to predict particle migration, and thus particle lift for...
Path-Based Gradient Boosting for Graph-Level Prediction
arXiv:2605.08102v1 Announce Type: new
Abstract: We propose PathBoost, a gradient tree boosting method for graph-level classification and regression that learns discriminative path-based features directly from the input graph structure. Building on a previous work, which was tailored to a specific c...