TIME: Temporally Intelligent Meta-reasoning Engine for Context Triggered Explicit Reasoning
arXiv:2601.05300v1 Announce Type: new
Abstract: Reasoning-oriented large language models often expose explicit "thinking" as long, turn-global traces at the start of every response, either always on or toggled externally at inference time. While useful for arithmetic, programming, and problem solvi...
Ontology Neural Networks for Topologically Conditioned Constraint Satisfaction
arXiv:2601.05304v1 Announce Type: new
Abstract: Neuro-symbolic reasoning systems face fundamental challenges in maintaining semantic coherence while satisfying physical and logical constraints. Building upon our previous work on Ontology Neural Networks, we present an enhanced framework that integr...
When the Server Steps In: Calibrated Updates for Fair Federated Learning
arXiv:2601.05352v1 Announce Type: new
Abstract: Federated learning (FL) has emerged as a transformative distributed learning paradigm, enabling multiple clients to collaboratively train a global model under the coordination of a central server without sharing their raw training data. While FL offer...
GlyRAG: Context-Aware Retrieval-Augmented Framework for Blood Glucose Forecasting
arXiv:2601.05353v1 Announce Type: new
Abstract: Accurate forecasting of blood glucose from continuous glucose monitoring (CGM) is essential for preventing dysglycemic events, thus enabling proactive diabetes management. However, current forecasting models treat blood glucose readings captured using CGMs as a numerical sequence, e...
Naiad: Novel Agentic Intelligent Autonomous System for Inland Water Monitoring
arXiv:2601.05256v1 Announce Type: new
Abstract: Inland water monitoring is vital for safeguarding public health and ecosystems, enabling timely interventions to mitigate risks. Existing methods often address isolated sub-problems such as cyanobacteria, chlorophyll, or other quality indicators separ...
Mathematical Knowledge Graph-Driven Framework for Equation-Based Predictive and Reliable Additive Manufacturing
arXiv:2601.05298v1 Announce Type: new
Abstract: Additive manufacturing (AM) relies critically on understanding and extrapolating process-property relationships; however, existing data-driven approaches remain limited by fragmented knowledge representations and unreliable extrapolation under sparse ...
Effects of personality steering on cooperative behavior in Large Language Model agents
arXiv:2601.05302v1 Announce Type: new
Abstract: Large language models (LLMs) are increasingly used as autonomous agents in strategic and social interactions. Although recent studies suggest that assigning personality traits to LLMs can influence their behavior, how personality steering affects coop...
Improving Enzyme Prediction with Chemical Reaction Equations by Hypergraph-Enhanced Knowledge Graph Embeddings
arXiv:2601.05330v1 Announce Type: new
Abstract: Predicting enzyme-substrate interactions has long been a fundamental problem in biochemistry and metabolic engineering. While existing methods could leverage databases of expert-curated enzyme-substrate pairs for models to learn from known pair intera...
The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models
arXiv:2601.05376v1 Announce Type: new
Abstract: Persona conditioning can be viewed as a behavioral prior for large language models (LLMs) and is often assumed to confer expertise and improve safety in a monotonic manner. However, its effects on high-stakes clinical decision-making remain poorly cha...
MoEs Are Stronger than You Think: Hyper-Parallel Inference Scaling with RoE
The generation quality of large language models (LLMs) is often improved by utilizing inference-time sequence-level scaling methods (e.g., Chain-of-Thought). We introduce hyper-parallel scaling, a complementary framework that improves prediction quality at the token level. Hyper-parallel scaling com...
Multivariate Conformal Prediction using Optimal Transport
Conformal prediction (CP) quantifies the uncertainty of machine learning models by constructing sets of plausible outputs. These sets are constructed by leveraging a so-called conformity score, a quantity computed using the input point of interest, a prediction model, and past observations. CP sets ...
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
Multimodal Large Language Models (MLLMs) in real-world applications require access to external knowledge sources and must remain responsive to the dynamic and ever-changing real-world information in order to address information-seeking and knowledge-intensive user queries. Existing approaches, such ...
Over-Searching in Search-Augmented Large Language Models
Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval. However, they often over-search, unnecessarily invoking the search tool even when it does not improve response quality, which leads to computational inefficiency and hallucinations by inc...
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
Unified multimodal Large Language Models (LLMs) that can both understand and generate visual content hold immense potential. However, existing open-source models often suffer from a performance trade-off between these capabilities. We present Manzano, a simple and scalable unified framework that sub...
An encoder (optical system) maps objects to noiseless images, which noise corrupts into measurements. Our information estimator uses only these noisy measurements and a noise model to quantify how well measurements distinguish objects.
Many imaging systems produce measurements that humans never se...
The Forgotten Shield: Safety Grafting in Parameter-Space for Medical MLLMs
arXiv:2601.04199v1 Announce Type: new
Abstract: Medical Multimodal Large Language Models (Medical MLLMs) have achieved remarkable progress in specialized medical tasks; however, research into their safety has lagged, posing potential risks for real-world deployment. In this paper, we first establis...
Green MLOps: Closed-Loop, Energy-Aware Inference with NVIDIA Triton, FastAPI, and Bio-Inspired Thresholding
arXiv:2601.04250v1 Announce Type: new
Abstract: Energy efficiency is a first-order concern in AI deployment, as long-running inference can exceed training in cumulative carbon impact. We propose a bio-inspired framework that maps protein-folding energy basins to inference cost landscapes and contro...
Safety-Utility Conflicts Are Not Global: Surgical Alignment via Head-Level Diagnosis
arXiv:2601.04262v1 Announce Type: new
Abstract: Safety alignment in Large Language Models (LLMs) inherently presents a multi-objective optimization conflict, often accompanied by an unintended degradation of general capabilities. Existing mitigation strategies typically rely on global gradient geom...
Learning to Reason: Temporal Saliency Distillation for Interpretable Knowledge Transfer
arXiv:2601.04263v1 Announce Type: new
Abstract: Knowledge distillation has proven effective for model compression by transferring knowledge from a larger network called the teacher to a smaller network called the student. Current knowledge distillation in time series is predominantly based on logit...
MemKD: Memory-Discrepancy Knowledge Distillation for Efficient Time Series Classification
arXiv:2601.04264v1 Announce Type: new
Abstract: Deep learning models, particularly recurrent neural networks and their variants, such as long short-term memory, have significantly advanced time series data analysis. These models capture complex, sequential patterns in time series, enabling real-tim...
Formal Analysis of AGI Decision-Theoretic Models and the Confrontation Question
arXiv:2601.04234v1 Announce Type: new
Abstract: Artificial General Intelligence (AGI) may face a confrontation question: under what conditions would a rationally self-interested AGI choose to seize power or eliminate human control (a confrontation) rather than remain cooperative? We formalize this ...
Actively Obtaining Environmental Feedback for Autonomous Action Evaluation Without Predefined Measurements
arXiv:2601.04235v1 Announce Type: new
Abstract: Obtaining reliable feedback from the environment is a fundamental capability for intelligent agents to evaluate the correctness of their actions and to accumulate reusable knowledge. However, most existing approaches rely on predefined measurements or...
SAGE-32B: Agentic Reasoning via Iterative Distillation
arXiv:2601.04237v1 Announce Type: new
Abstract: We demonstrate SAGE-32B, a 32-billion-parameter language model that focuses on agentic reasoning and long-range planning tasks. Unlike chat models that aim for general conversation fluency, SAGE-32B is designed to operate in an agentic loop, emphasizi...
arXiv:2601.04239v1 Announce Type: new
Abstract: The Cyclic Antibandwidth Problem (CABP), a variant of the Antibandwidth Problem, is an NP-hard graph labeling problem with numerous applications. Despite significant research efforts, existing state-of-the-art approaches for CABP are exclusively heuri...