Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration
arXiv:2605.20190v1 Announce Type: new
Abstract: Iterative industrial design-simulation optimization is bottlenecked by the CAD-CAE semantic gap: translating simulation feedback into valid geometric edits under diverse, coupled constraints. To fill this gap, we propose COSMO-Agent (Closed-loop Optim...
SOLAR: A Self-Optimizing Open-Ended Autonomous Agent for Lifelong Learning and Continual Adaptation
arXiv:2605.20189v1 Announce Type: new
Abstract: Despite the remarkable success of large language models (LLMs), they still face bottlenecks when deployed in dynamic, real-world settings, the primary challenges being concept drift and the high cost of gradient-based adaptation. Traditional fine-tu...
Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine
arXiv:2605.20235v1 Announce Type: new
Abstract: Diffusion models generate high-dimensional data with remarkable quality, yet how their training efficiently learns the score function, bypassing the curse of dimensionality when data is supported on low-dimensional manifolds, remains theoretically une...
TabPFN-MT: A Natively Multitask In-Context Learner for Tabular Data
arXiv:2605.20234v1 Announce Type: new
Abstract: Prior-Data Fitted Networks (PFNs) have been very successful in tabular contexts, handling prediction tasks in context. However, they are designed for single-task inference, meaning that predicting several target values within a context requires repeat...
GraphDiffMed: Knowledge-Constrained Differential Attention with Pharmacological Graph Priors for Medication Recommendation
arXiv:2605.20188v1 Announce Type: new
Abstract: Recommending safe and effective medication combinations from electronic health records (EHRs) is a core clinical AI problem, yet it remains difficult because patient trajectories are long, noisy, and clinically heterogeneous. Existing methods typicall...
Neural Estimation of Pairwise Mutual Information in Masked Discrete Sequence Models
arXiv:2605.20187v1 Announce Type: new
Abstract: Understanding dependencies between variables is critical for interpretability and efficient generation in masked diffusion models (MDMs), yet these models primarily expose marginal conditional distributions and do not explicitly represent inter-variab...
New Approach to Scaling Laws Could Change How AI Models Are Trained
Leveraging statistical concepts from measurement science and education, AI researchers have greatly reduced the computational demand of predicting how the largest of large language models will scale up in the future. It could save millions of dollars in training costs.
AgentNLQ: A General-Purpose Agent for Natural Language to SQL
arXiv:2605.19010v1 Announce Type: new
Abstract: Natural language to SQL (NL2SQL) conversion is an important problem for researchers and enterprises because relational databases are ubiquitous across a broad range of practical problems. Despite the rapid advancements in the capabilities of LLM...
Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency
arXiv:2605.19008v1 Announce Type: new
Abstract: Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded...
Evaluating the Utility of Personal Health Records in Personalized Health AI
arXiv:2605.18937v1 Announce Type: new
Abstract: Patient-managed Personal Health Records (PHRs) promise to empower patients to better understand their health, but information in the record is complex, potentially hindering insights. In this study, we assess the potential of large language models (L...
Operationalizing Document AI: A Microservice Architecture for OCR and LLM Pipelines in Production
arXiv:2605.18818v1 Announce Type: new
Abstract: Academic research tends to focus on new models for document understanding, creating a wide gap in the literature between model definition and running models at production scale. To close that gap, we present a microservice architecture that encapsulate...
Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance
arXiv:2605.18801v1 Announce Type: new
Abstract: Data is fundamental to large language models (LLMs). However, understanding of what makes certain data useful for different stages of an LLM workflow, including training, tuning, alignment, in-context learning, etc., and why, remains an open question....
Simply Stabilizing the Loop via Fully Looped Transformer
arXiv:2605.18797v1 Announce Type: new
Abstract: Scaling model performance typically requires increasing model size. Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer blocks, trading additional computation for improved performance without increasing param...
UCCI: Calibrated Uncertainty for Cost-Optimal LLM Cascade Routing
arXiv:2605.18796v1 Announce Type: new
Abstract: LLM cascades and model routing promise lower inference cost by sending easy queries to a small model and escalating hard ones to a large model, but most deployed routers use uncalibrated confidence scores and require per-workload threshold tuning. We ...
HELLoRA: Hot Experts Layer-Level Low-Rank Adaptation for Mixture-of-Experts Models
arXiv:2605.18795v1 Announce Type: new
Abstract: Low-Rank Adaptation (LoRA) dominates parameter-efficient fine-tuning of large language models, yet most variants target dense architectures. Mixture-of-Experts (MoE) models scale parameters at near-constant per-token compute, and their sparse activati...
Robust Basis Spline Decoupling for the Compression of Transformer Models
arXiv:2605.18794v1 Announce Type: new
Abstract: Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer decoupling can be viewed as a fully connected neural network with a single ...
Dimensional Balance Improves Large Scale Spatiotemporal Prediction Performance
arXiv:2605.18793v1 Announce Type: new
Abstract: Accurate spatiotemporal pattern analysis is critical in fields such as urban traffic, meteorology, and public health monitoring. However, existing methods face performance bottlenecks, typically yielding only incremental gains and often exhibiting lim...
What if the model you've been evaluating has been evaluating you right back? New research finds that LLMs systematically alter their output depending on whether, and by whom, they believe they are being observed. It might have serious implications. Are you ready?
Scalable Uncertainty Reasoning in Knowledge Graphs
arXiv:2605.16568v1 Announce Type: new
Abstract: Knowledge Graphs are pivotal for semantic data integration. The real-world data they model is often inherently uncertain. Within knowledge graphs, uncertainty manifests at three distinct levels: imprecise attribute values, probabilistic triple existen...
Skim: Speculative Execution for Fast and Efficient Web Agents
arXiv:2605.16565v2 Announce Type: new
Abstract: Skim is a speculative execution framework for web agents that exploits the predictable structure of purpose-built websites. Today's web-agent expense is not intrinsic to the tasks but a property of how agents are composed: frontier-model inference, br...
From Prompts to Protocols: An AI Agent for Laboratory Automation
arXiv:2605.16552v1 Announce Type: new
Abstract: Automating science laboratories enables faster, safer, more accurate, and more reproducible execution of protocols, accelerating the discovery and testing of new materials, drugs, and more. However, setting up and running autonomous labs requires coor...
ANNEAL: Adapting LLM Agents via Governed Symbolic Patch Learning
arXiv:2605.16309v1 Announce Type: new
Abstract: LLM-based agents can recover from individual execution errors, yet they repeatedly fail on the same fault when the underlying process knowledge--operator schemas, preconditions, and constraints--remains unrepaired. Existing self-evolving approaches ad...
AgentWall: A Runtime Safety Layer for Local AI Agents
arXiv:2605.16265v1 Announce Type: new
Abstract: The safety of autonomous AI agents is increasingly recognized as a critical open problem. As agents transition from passive text generators to active actors capable of executing shell commands, modifying files, calling APIs, and browsing the web, the ...
When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning
arXiv:2605.16312v1 Announce Type: new
Abstract: We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. A...