arXiv:2603.12710v1 Announce Type: new
Abstract: Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it difficult to diagnose why they fail or how they plan....
From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness
arXiv:2603.12288v1 Announce Type: new
Abstract: Tabular machine learning presents a paradox: modern models achieve state-of-the-art performance using high-dimensional (high-D), collinear, error-prone data, defying the "Garbage In, Garbage Out" mantra. To help resolve this, we synthesize principles ...
Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency
arXiv:2603.12298v1 Announce Type: new
Abstract: Activation engineering enables precise control over Large Language Models (LLMs) without the computational cost of fine-tuning. However, existing methods deriving vectors from static activation differences are susceptible to high-dimensional noise and...
Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction
arXiv:2603.12293v1 Announce Type: new
Abstract: Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these, we propos...
Reversible Lifelong Model Editing via Semantic Routing-Based LoRA
arXiv:2603.11239v1 Announce Type: new
Abstract: The dynamic evolution of the real world necessitates model editing within Large Language Models. While existing methods explore modular isolation or parameter-efficient strategies, they still suffer from semantic drift or knowledge forgetting due to conti...
A Survey of Reasoning in Autonomous Driving Systems: Open Challenges and Emerging Paradigms
arXiv:2603.11093v1 Announce Type: new
Abstract: The development of high-level autonomous driving (AD) is shifting from perception-centric limitations to a more fundamental bottleneck, namely, a deficit in robust and generalizable reasoning. Although current AD systems manage structured environments...
Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios
arXiv:2603.11214v1 Announce Type: new
Abstract: We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across exten...
DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
arXiv:2603.11076v1 Announce Type: new
Abstract: Recent work synthesizes agentic tasks for post-training tool-using LLMs, yet robust generalization under shifts in tasks and toolsets remains an open challenge. We trace this brittleness to insufficient diversity in synthesized tasks. Scaling diversit...
PACED: Distillation at the Frontier of Student Competence
arXiv:2603.11178v1 Announce Type: new
Abstract: Standard LLM distillation wastes compute on two fronts: problems the student has already mastered (near-zero gradients) and problems far beyond its reach (incoherent gradients that erode existing capabilities). We show that this waste is not merely in...
Interventional Time Series Priors for Causal Foundation Models
arXiv:2603.11090v1 Announce Type: new
Abstract: Prior-data fitted networks (PFNs) have emerged as powerful foundation models for tabular causal inference, yet their extension to time series remains limited by the absence of synthetic data generators that provide interventional targets. Existing tim...
Graph Tokenization for Bridging Graphs and Transformers
arXiv:2603.11099v1 Announce Type: new
Abstract: The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these models to graph-structured data remains a significant challenge. In this work, we introduce a graph tokenization...
Agentic Control Center for Data Product Optimization
arXiv:2603.10133v1 Announce Type: new
Abstract: Data products enable end users to gain greater insights about their data by providing supporting assets, such as example question-SQL pairs which can be answered using the data or views over the database tables. However, producing useful data products...
Hybrid Self-evolving Structured Memory for GUI Agents
arXiv:2603.10291v1 Announce Type: new
Abstract: The remarkable progress of vision-language models (VLMs) has enabled GUI agents to interact with computers in a human-like manner. Yet real-world computer-use tasks remain difficult due to long-horizon workflows, diverse interfaces, and frequent inter...
Verbalizing LLM's Higher-order Uncertainty via Imprecise Probabilities
arXiv:2603.10396v1 Announce Type: new
Abstract: Despite the growing demand for eliciting uncertainty from large language models (LLMs), empirical evidence suggests that LLM behavior is not always adequately captured by the elicitation techniques developed under the classical probabilistic uncertain...
Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability
arXiv:2603.10384v1 Announce Type: new
Abstract: Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We introduce TRACED, a framework that assesses reasoning quality through theoretically grounded geometric kinematics. By decomposing reaso...
HEAL: Hindsight Entropy-Assisted Learning for Reasoning Distillation
arXiv:2603.10359v1 Announce Type: new
Abstract: Distilling reasoning capabilities from Large Reasoning Models (LRMs) into smaller models is typically constrained by the limitation of rejection sampling. Standard methods treat the teacher as a static filter, discarding complex "corner-case" problems...
MoE-SpAc: Efficient MoE Inference Based on Speculative Activation Utility in Heterogeneous Edge Scenarios
arXiv:2603.09983v1 Announce Type: new
Abstract: Mixture-of-Experts (MoE) models enable scalable performance but face severe memory constraints on edge devices. Existing offloading strategies struggle with I/O bottlenecks due to the dynamic, low-information nature of autoregressive expert activation...
Personalized Group Relative Policy Optimization for Heterogenous Preference Alignment
arXiv:2603.10009v1 Announce Type: new
Abstract: Despite their sophisticated general-purpose capabilities, Large Language Models (LLMs) often fail to align with diverse individual preferences because standard post-training methods, like Reinforcement Learning from Human Feedback (RLHF), optimize for...
arXiv:2603.09980v1 Announce Type: new
Abstract: LLM unlearning is essential for mitigating safety, copyright, and privacy concerns in pre-trained large language models (LLMs). Compared to preference alignment, it offers a more explicit way by removing undesirable knowledge characterized by specific...
LWM-Temporal: Sparse Spatio-Temporal Attention for Wireless Channel Representation Learning
arXiv:2603.10024v1 Announce Type: new
Abstract: LWM-Temporal is a new member of the Large Wireless Models (LWM) family that targets the spatiotemporal nature of wireless channels. Designed as a task-agnostic foundation model, LWM-Temporal learns universal channel embeddings that capture mobility-in...
SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning
arXiv:2603.08763v1 Announce Type: new
Abstract: A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie tas...
Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields
arXiv:2603.08758v1 Announce Type: new
Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively ...
Hindsight Credit Assignment for Long-Horizon LLM Agents
arXiv:2603.08754v1 Announce Type: new
Abstract: Large Language Model (LLM) agents often face significant credit assignment challenges in long-horizon, multi-step tasks due to sparse rewards. Existing value-free methods, such as Group Relative Policy Optimization (GRPO), encounter two fundamental bo...
arXiv:2603.08717v1 Announce Type: new
Abstract: AI-enabled Radio Access Networks (AI-RANs) are expected to serve heterogeneous users with time-varying learning tasks over shared edge resources. Ensuring equitable inference performance across these users requires adaptive and fair learning mechanism...