BALAR: A Bayesian Agentic Loop for Active Reasoning
arXiv:2605.05386v1 Announce Type: new
Abstract: Large language models increasingly operate in interactive settings where solving a task requires multiple rounds of information exchange with a user. However, most current systems treat dialogue reactively and lack a principled mechanism to reason abo...
Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems
arXiv:2605.05379v1 Announce Type: new
Abstract: Enterprise agents increasingly operate inside scoped retrieval systems, delegated workflows, and policy-constrained evidence environments. In these settings, access control can be enforced correctly while the system still produces an answer that appea...
Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections
arXiv:2605.05402v1 Announce Type: new
Abstract: Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study introduces an AI-enabled analytics framework leveraging existing CCTV infrastructure to evaluate the impact of soft interventions, such as tem...
Nationwide EHR-Based Chronic Rhinosinusitis Prediction Using Demographic-Stratified Models
arXiv:2605.05213v1 Announce Type: new
Abstract: Chronic rhinosinusitis (CRS) is a common heterogeneous inflammatory disorder that causes substantial morbidity and healthcare costs. CRS is difficult to identify early from routine encounters, as symptom presentations overlap with common conditions su...
SAT: Sequential Agent Tuning for Coordinator Free Plug and Play Multi-LLM Training with Monotonic Improvement Guarantees
arXiv:2605.05216v1 Announce Type: new
Abstract: Large language models (LLMs) with a large number of parameters achieve strong performance but are often prohibitively expensive to deploy. Recent work explores using teams of smaller, more efficient LLMs that collectively match or even outperform a si...
Physics-Informed Neural Networks with Learnable Loss Balancing and Transfer Learning
arXiv:2605.05217v1 Announce Type: new
Abstract: We propose a self-supervised physics-informed neural network (PINN) framework that adaptively balances physics-based and data-driven supervision for scientific machine learning under data scarcity. Unlike prior PINNs that rely on fixed or heuristic we...
MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning
arXiv:2605.04058v1 Announce Type: new
Abstract: Parameter-efficient transfer learning (PETL) has emerged as a pivotal paradigm for adapting pre-trained foundation models to downstream tasks, significantly reducing trainable parameters yet suffering from substantial memory overhead caused by gradien...
Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search
arXiv:2605.04057v1 Announce Type: new
Abstract: This paper focuses on a key challenge in Neural Architecture Search (NAS): integrating established architectural knowledge while exploring new designs under expensive evaluations. Large language models (LLMs) are a promising assistant for NAS because ...
Transformation Categorization Based on Group Decomposition Theory Using Parameter Division
arXiv:2605.04056v1 Announce Type: new
Abstract: Representation learning seeks meaningful sensory representations without supervision and can model aspects of human development. Although many neural networks empirically learn useful features, a principled account of what makes a representation "good...
Endogenous Regime Switching Driven by Scalar-Irreducible Learning Dynamics
arXiv:2605.04054v1 Announce Type: new
Abstract: Achieving endogenous regime switching is crucial for the emergence of autonomous intelligence, yet remains a central challenge for existing machine learning frameworks, where such transitions are typically externally imposed. In this work, we introduc...
Actionable Real-Time Modeling of Surgical Team Dynamics via Time-Expanded Interaction Graphs
arXiv:2605.04169v1 Announce Type: new
Abstract: Surgical team performance arises from complex interactions between technical execution and non-technical skills, including communication and coordination dynamics. However, current surgical AI systems predominantly model visual workflow signals, lacki...
Pro$^2$Assist: Continuous Step-Aware Proactive Assistance with Multimodal Egocentric Perception for Long-Horizon Procedural Tasks
arXiv:2605.04227v1 Announce Type: new
Abstract: Procedural tasks with multiple ordered steps are ubiquitous in daily life. Recent advances in multimodal large language models (MLLMs) have enabled personal assistants that support daily activities. However, existing systems primarily provide reactive...
arXiv:2605.04050v1 Announce Type: new
Abstract: We introduce Lossless Context Management (LCM), a deterministic architecture for LLM memory that outperforms Claude Code on long-context tasks. When benchmarked using Opus 4.6, our LCM-augmented coding agent, Volt, achieves higher scores than Claude C...
Delay, Plateau, or Collapse: Evaluating the Impact of Systematic Verification Error on RLVR
arXiv:2605.02909v1 Announce Type: new
Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a powerful approach for improving the reasoning capabilities of large language models (LLMs). While RLVR is designed for tasks with verifiable ground-truth answers, real-world verifiers ...
An End-to-End Framework for Building Large Language Models for Software Operations
arXiv:2605.02906v1 Announce Type: new
Abstract: In the field of software operations, Large Language Models (LLMs) have attracted increasing attention. However, existing research has not yet achieved efficient and effective end-to-end intelligent operations due to low-quality data, fragmented knowle...
StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing
arXiv:2605.02904v1 Announce Type: new
Abstract: We present StateSMix, a fully self-contained lossless compressor that couples an online-trained Mamba-style State Space Model (SSM) with sparse n-gram context mixing and arithmetic coding. The model is initialised from scratch and trained token-by-tok...
eOptShrinkQ: Near-Lossless KV Cache Compression Through Optimal Spectral Denoising and Quantization
arXiv:2605.02905v1 Announce Type: new
Abstract: We show that the key-value (KV) cache in transformer attention heads admits a natural decomposition into a low-rank \emph{shared context} component and a full-rank \emph{per-token} residual, well described by the spiked random matrix model. This obser...
CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing
arXiv:2605.02910v2 Announce Type: new
Abstract: Recent advances in large language models have led to strong performance on reasoning and environment-interaction tasks, yet their ability for creative problem-solving remains underexplored. We study this capability through the lens of creative tool us...
Stable Agentic Control: Tool-Mediated LLM Architecture for Autonomous Cyber Defense
arXiv:2605.03034v1 Announce Type: new
Abstract: Agentic systems involved in high-stakes decision-making under adversarial pressure need formal guarantees not offered by existing approaches. Motivated by the operational needs of security operations centers (SOCs) that must configure endpoint detectio...
Programmatic Context Augmentation for LLM-based Symbolic Regression
arXiv:2605.03101v1 Announce Type: new
Abstract: Symbolic regression (SR), the task of discovering mathematical expressions that best describe a given dataset, remains a fundamental challenge in scientific discovery. Traditional approaches, primarily based on genetic algorithms and related evolution...
Fast Log-Domain Sinkhorn Optimal Transport with Warp-Level GPU Reductions
arXiv:2605.00837v1 Announce Type: new
Abstract: Entropic regularized optimal transport (OT) via the Sinkhorn algorithm has become a fundamental tool in machine learning, yet existing implementations either suffer from numerical instability for small regularization parameters or incur significant ov...
Agentopic: A Generative AI Agent Workflow for Explainable Topic Modeling
arXiv:2605.00833v1 Announce Type: new
Abstract: Agentopic is a novel agent-based workflow for explainable topic modeling that leverages the reasoning capabilities of Large Language Models (LLMs). Existing topic modeling approaches such as Latent Dirichlet Allocation (LDA) and BERTopic often lack tr...
Polynomial-Time Optimal Group Selection via the Double-Commutator Eigenvalue Problem
arXiv:2605.00834v1 Announce Type: new
Abstract: The algebraic diversity framework replaces temporal averaging over multiple observations with algebraic group action on a single observation for second-order statistical estimation. The central open problem in this framework is $\textit{group selectio...