DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents
arXiv:2602.07035v1 Announce Type: new
Abstract: Recently, Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Meanwhile, despite the rapid advancement of Search Agents, thei...
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?
arXiv:2602.07055v1 Announce Type: new
Abstract: Spatial embodied intelligence requires agents to act to acquire information under partial observability. While multimodal foundation models excel at passive perception, their capacity for active, self-directed exploration remains understudied. We prop...
Aster: Autonomous Scientific Discovery over 20x Faster Than Existing Methods
arXiv:2602.07040v1 Announce Type: new
Abstract: We introduce Aster, an AI agent for autonomous scientific discovery capable of operating over 20 times faster than existing frameworks. Given a task, an initial program, and a script to evaluate the performance of the program, Aster iteratively improv...
ST-Raptor: An Agentic System for Semi-Structured Table QA
arXiv:2602.07034v1 Announce Type: new
Abstract: Semi-structured table question answering (QA) is a challenging task that requires (1) precise extraction of cell contents and positions and (2) accurate recovery of key implicit logical structures, hierarchical relationships, and semantic associations...
Toward Faithful and Complete Answer Construction from a Single Document
arXiv:2602.06103v1 Announce Type: new
Abstract: Modern large language models (LLMs) are powerful generators driven by statistical next-token prediction. While effective at producing fluent text, this design biases models toward high-probability continuations rather than exhaustive and faithful answ...
Pragmatic Curiosity: A Hybrid Learning-Optimization Paradigm via Active Inference
arXiv:2602.06104v1 Announce Type: new
Abstract: Many engineering and scientific workflows depend on expensive black-box evaluations, requiring decision-making that simultaneously improves performance and reduces uncertainty. Bayesian optimization (BO) and Bayesian experimental design (BED) offer po...
Private and interpretable clinical prediction with quantum-inspired tensor train models
arXiv:2602.06110v1 Announce Type: new
Abstract: Machine learning in clinical settings must balance predictive accuracy, interpretability, and privacy. Models such as logistic regression (LR) offer transparency, while neural networks (NNs) provide greater predictive power; yet both remain vulnerable...
Agentic Workflow Using RBA$_\theta$ for Event Prediction
arXiv:2602.06097v1 Announce Type: new
Abstract: Wind power ramp events are difficult to forecast due to strong variability, multi-scale dynamics, and site-specific meteorological effects. This paper proposes an event-first, frequency-aware forecasting paradigm that directly predicts ramp events and...
NanoNet: Parameter-Efficient Learning with Label-Scarce Supervision for Lightweight Text Mining Model
arXiv:2602.06093v1 Announce Type: new
Abstract: The lightweight semi-supervised learning (LSL) strategy provides an effective approach to conserving labeled samples and minimizing model inference costs. Prior research has effectively applied knowledge transfer learning and co-training regularizatio...
arXiv:2602.06176v1 Announce Type: new
Abstract: Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios. To ...
Exposing Weaknesses of Large Reasoning Models through Graph Algorithm Problems
arXiv:2602.06319v1 Announce Type: new
Abstract: Large Reasoning Models (LRMs) have advanced rapidly; however, existing benchmarks in mathematics, code, and common-sense reasoning remain limited. They lack long-context evaluation, offer insufficient challenge, and provide answers that are difficult ...
Do LLMs Act Like Rational Agents? Measuring Belief Coherence in Probabilistic Decision Making
arXiv:2602.06286v1 Announce Type: new
Abstract: Large language models (LLMs) are increasingly deployed as agents in high-stakes domains where optimal actions depend on both uncertainty about the world and consideration of utilities of different outcomes, yet their decision logic remains difficult t...
Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning (Extended Version)
arXiv:2602.06227v1 Announce Type: new
Abstract: In this work, we propose a novel framework for the logical specification of non-Markovian rewards in Markov Decision Processes (MDPs) with large state spaces. Our approach leverages Linear Temporal Logic Modulo Theories over finite traces (LTLfMT), a ...
arXiv:2602.06107v1 Announce Type: new
Abstract: Reinforcement learning (RL) for large language models (LLMs) remains expensive, largely because rollout generation is costly. Decoupling rollout generation from policy optimization (e.g., leveraging a more efficient model to perform rollouts) could enable sub...
Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents
arXiv:2602.05073v1 Announce Type: new
Abstract: Uncertainty quantification (UQ) for large language models (LLMs) is a key building block for safety guardrails of daily LLM applications. Yet, even as LLM agents are increasingly deployed in highly complex tasks, most UQ research still centers on sing...
Denoising diffusion networks for normative modeling in neuroimaging
arXiv:2602.04886v1 Announce Type: new
Abstract: Normative modeling estimates reference distributions of biological measures conditional on covariates, enabling centiles and clinically interpretable deviation scores to be derived. Most neuroimaging pipelines fit one model per imaging-derived phenoty...
DCER: Dual-Stage Compression and Energy-Based Reconstruction
arXiv:2602.04904v1 Announce Type: new
Abstract: Multimodal fusion faces two robustness challenges: noisy inputs degrade representation quality, and missing modalities cause prediction failures. We propose DCER, a unified framework addressing both challenges through dual-stage compression and ener...
Momentum Attention: The Physics of In-Context Learning and Spectral Forensics for Mechanistic Interpretability
arXiv:2602.04902v1 Announce Type: new
Abstract: The Mechanistic Interpretability (MI) program has mapped the Transformer as a precise computational graph. We extend this graph with a conservation law and time-varying AC dynamics, viewing it as a physical circuit. We introduce Momentum Attention, a ...
Mind the Performance Gap: Capability-Behavior Trade-offs in Feature Steering
arXiv:2602.04903v1 Announce Type: new
Abstract: Feature steering has emerged as a promising approach for controlling LLM behavior through direct manipulation of internal representations, offering advantages over prompt engineering. However, its practical effectiveness in real-world applications rem...
Evaluating Large Language Models on Solved and Unsolved Problems in Graph Theory: Implications for Computing Education
arXiv:2602.05059v1 Announce Type: new
Abstract: Large Language Models are increasingly used by students to explore advanced material in computer science, including graph theory. As these tools become integrated into undergraduate and graduate coursework, it is important to understand how reliably t...
A Causal Perspective for Enhancing Jailbreak Attack and Defense
arXiv:2602.04893v1 Announce Type: new
Abstract: Uncovering the mechanisms behind "jailbreaks" in large language models (LLMs) is crucial for enhancing their safety and reliability, yet these mechanisms remain poorly understood. Existing studies predominantly analyze jailbreak prompts by probing lat...
Artificial Intelligence as Strange Intelligence: Against Linear Models of Intelligence
arXiv:2602.04986v1 Announce Type: new
Abstract: We endorse and expand upon Susan Schneider's critique of the linear model of AI progress and introduce two novel concepts: "familiar intelligence" and "strange intelligence". AI intelligence is likely to be strange intelligence, defying familiar patte...
DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search
arXiv:2602.05014v1 Announce Type: new
Abstract: With the rapid progress of tool-using and agentic large language models (LLMs), Retrieval-Augmented Generation (RAG) is evolving from one-shot, passive retrieval into multi-turn, decision-driven evidence acquisition. Despite strong results in open-dom...
Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure
arXiv:2602.03975v1 Announce Type: new
Abstract: Test-time computation has become a primary driver of progress in large language model (LLM) reasoning, but it is increasingly bottlenecked by expensive verification. In many reasoning systems, a large fraction of verifier calls are spent on redundant ...