Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning (Extended Version)
arXiv:2602.06227v1 Announce Type: new
Abstract: In this work, we propose a novel framework for the logical specification of non-Markovian rewards in Markov Decision Processes (MDPs) with large state spaces. Our approach leverages Linear Temporal Logic Modulo Theories over finite traces (LTLfMT), a ...
Denoising diffusion networks for normative modeling in neuroimaging
arXiv:2602.04886v1 Announce Type: new
Abstract: Normative modeling estimates reference distributions of biological measures conditional on covariates, enabling centiles and clinically interpretable deviation scores to be derived. Most neuroimaging pipelines fit one model per imaging-derived phenoty...
DCER: Dual-Stage Compression and Energy-Based Reconstruction
arXiv:2602.04904v1 Announce Type: new
Abstract: Multimodal fusion faces two robustness challenges: noisy inputs degrade representation quality, and missing modalities cause prediction failures. We propose DCER, a unified framework addressing both challenges through dual-stage compression and ener...
Momentum Attention: The Physics of In-Context Learning and Spectral Forensics for Mechanistic Interpretability
arXiv:2602.04902v1 Announce Type: new
Abstract: The Mechanistic Interpretability (MI) program has mapped the Transformer as a precise computational graph. We extend this graph with a conservation law and time-varying AC dynamics, viewing it as a physical circuit. We introduce Momentum Attention, a ...
Mind the Performance Gap: Capability-Behavior Trade-offs in Feature Steering
arXiv:2602.04903v1 Announce Type: new
Abstract: Feature steering has emerged as a promising approach for controlling LLM behavior through direct manipulation of internal representations, offering advantages over prompt engineering. However, its practical effectiveness in real-world applications rem...
A Causal Perspective for Enhancing Jailbreak Attack and Defense
arXiv:2602.04893v1 Announce Type: new
Abstract: Uncovering the mechanisms behind "jailbreaks" in large language models (LLMs) is crucial for enhancing their safety and reliability, yet these mechanisms remain poorly understood. Existing studies predominantly analyze jailbreak prompts by probing lat...
Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents
arXiv:2602.05073v1 Announce Type: new
Abstract: Uncertainty quantification (UQ) for large language models (LLMs) is a key building block for safety guardrails of daily LLM applications. Yet, even as LLM agents are increasingly deployed in highly complex tasks, most UQ research still centers on sing...
Evaluating Large Language Models on Solved and Unsolved Problems in Graph Theory: Implications for Computing Education
arXiv:2602.05059v1 Announce Type: new
Abstract: Large Language Models are increasingly used by students to explore advanced material in computer science, including graph theory. As these tools become integrated into undergraduate and graduate coursework, it is important to understand how reliably t...
DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search
arXiv:2602.05014v1 Announce Type: new
Abstract: With the rapid progress of tool-using and agentic large language models (LLMs), Retrieval-Augmented Generation (RAG) is evolving from one-shot, passive retrieval into multi-turn, decision-driven evidence acquisition. Despite strong results in open-dom...
Artificial Intelligence as Strange Intelligence: Against Linear Models of Intelligence
arXiv:2602.04986v1 Announce Type: new
Abstract: We endorse and expand upon Susan Schneider's critique of the linear model of AI progress and introduce two novel concepts: "familiar intelligence" and "strange intelligence". AI intelligence is likely to be strange intelligence, defying familiar patte...
GeoIB: Geometry-Aware Information Bottleneck via Statistical-Manifold Compression
arXiv:2602.03906v1 Announce Type: new
Abstract: Information Bottleneck (IB) is widely used, but in deep learning, it is usually implemented through tractable surrogates, such as variational bounds or neural mutual information (MI) estimators, rather than directly controlling the MI I(X;Z) itself. T...
arXiv:2602.03876v1 Announce Type: new
Abstract: Standard reinforcement learning from human feedback (RLHF) trains a reward model on pairwise preference data and then uses it for policy optimization. However, while reward models are optimized to capture relative preferences, existing policy optimiza...
NeuroPareto: Calibrated Acquisition for Costly Many-Goal Search in Vast Parameter Spaces
arXiv:2602.03901v1 Announce Type: new
Abstract: The pursuit of optimal trade-offs in high-dimensional search spaces under stringent computational constraints poses a fundamental challenge for contemporary multi-objective optimization. We develop NeuroPareto, a cohesive architecture that integrates ...
Understanding the Impact of Differentially Private Training on Memorization of Long-Tailed Data
arXiv:2602.03872v1 Announce Type: new
Abstract: Recent research shows that modern deep learning models achieve high predictive accuracy partly by memorizing individual training samples. Such memorization raises serious privacy concerns, motivating the widespread adoption of differentially private t...
Reversible Deep Learning for 13C NMR in Chemoinformatics: On Structures and Spectra
arXiv:2602.03875v1 Announce Type: new
Abstract: We introduce a reversible deep learning model for 13C NMR that uses a single conditional invertible neural network for both directions between molecular structures and spectra. The network is built from i-RevNet style bijective blocks, so the forward ...
Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure
arXiv:2602.03975v1 Announce Type: new
Abstract: Test-time computation has become a primary driver of progress in large language model (LLM) reasoning, but it is increasingly bottlenecked by expensive verification. In many reasoning systems, a large fraction of verifier calls are spent on redundant ...
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent
arXiv:2602.03955v1 Announce Type: new
Abstract: While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, practical deployment is limited by their high computational cost and error propagation. This paper proposes AgentArk, a novel framewo...
Knowledge Model Prompting Increases LLM Performance on Planning Tasks
arXiv:2602.03900v1 Announce Type: new
Abstract: Large Language Models (LLMs) can struggle with reasoning and planning tasks. Many prompting techniques have been developed to assist with LLM reasoning, notably Chain-of-Thought (CoT); however, these techniques, too, have come under scrutiny as...
Enhancing Mathematical Problem Solving in LLMs through Execution-Driven Reasoning Augmentation
arXiv:2602.03950v1 Announce Type: new
Abstract: Mathematical problem solving is a fundamental benchmark for assessing the reasoning capabilities of artificial intelligence and a gateway to applications in education, science, and engineering where reliable symbolic reasoning is essential. Although r...
Augmenting Parameter-Efficient Pre-trained Language Models with Large Language Models
arXiv:2602.02501v1 Announce Type: new
Abstract: Training AI models in cybersecurity with the help of vast datasets offers significant opportunities to mimic real-world behaviors effectively. However, challenges like data drift and scarcity of labelled data lead to frequent updates of models and the ris...
What Drives Length of Stay After Elective Spine Surgery? Insights from a Decade of Predictive Modeling
arXiv:2602.02517v1 Announce Type: new
Abstract: Objective: Predicting length of stay after elective spine surgery is essential for optimizing patient outcomes and hospital resource use. This systematic review synthesizes computational methods used to predict length of stay in this patient populatio...
Uncertainty and Fairness Awareness in LLM-Based Recommendation Systems
arXiv:2602.02582v1 Announce Type: new
Abstract: Large language models (LLMs) enable powerful zero-shot recommendations by leveraging broad contextual knowledge, yet predictive uncertainty and embedded biases threaten reliability and fairness. This paper studies how uncertainty and fairness evaluati...
A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior
arXiv:2602.02639v1 Announce Type: new
Abstract: LLM self-explanations are often presented as a promising tool for AI oversight, yet their faithfulness to the model's true reasoning process is poorly understood. Existing faithfulness metrics have critical limitations, typically relying on identifyin...
Experience-Driven Multi-Agent Systems Are Training-free Context-aware Earth Observers
arXiv:2602.02559v1 Announce Type: new
Abstract: Recent advances have enabled large language model (LLM) agents to solve complex tasks by orchestrating external tools. However, these agents often struggle in specialized, tool-intensive domains that demand long-horizon execution, tight coordination a...