The Forgotten Shield: Safety Grafting in Parameter-Space for Medical MLLMs
arXiv:2601.04199v1 Announce Type: new
Abstract: Medical Multimodal Large Language Models (Medical MLLMs) have achieved remarkable progress in specialized medical tasks; however, research into their safety has lagged, posing potential risks for real-world deployment. In this paper, we first establis...
MemKD: Memory-Discrepancy Knowledge Distillation for Efficient Time Series Classification
arXiv:2601.04264v1 Announce Type: new
Abstract: Deep learning models, particularly recurrent neural networks and their variants, such as long short-term memory, have significantly advanced time series data analysis. These models capture complex, sequential patterns in time series, enabling real-tim...
Green MLOps: Closed-Loop, Energy-Aware Inference with NVIDIA Triton, FastAPI, and Bio-Inspired Thresholding
arXiv:2601.04250v1 Announce Type: new
Abstract: Energy efficiency is a first-order concern in AI deployment, as long-running inference can exceed training in cumulative carbon impact. We propose a bio-inspired framework that maps protein-folding energy basins to inference cost landscapes and contro...
SAGE-32B: Agentic Reasoning via Iterative Distillation
arXiv:2601.04237v1 Announce Type: new
Abstract: We demonstrate SAGE-32B, a 32-billion-parameter language model that focuses on agentic reasoning and long-range planning tasks. Unlike chat models that aim for general conversation fluency, SAGE-32B is designed to operate in an agentic loop, emphasizi...
arXiv:2601.04239v1 Announce Type: new
Abstract: The Cyclic Antibandwidth Problem (CABP), a variant of the Antibandwidth Problem, is an NP-hard graph labeling problem with numerous applications. Despite significant research efforts, existing state-of-the-art approaches for CABP are exclusively heuri...
Formal Analysis of AGI Decision-Theoretic Models and the Confrontation Question
arXiv:2601.04234v1 Announce Type: new
Abstract: Artificial General Intelligence (AGI) may face a confrontation question: under what conditions would a rationally self-interested AGI choose to seize power or eliminate human control (a confrontation) rather than remain cooperative? We formalize this ...
Actively Obtaining Environmental Feedback for Autonomous Action Evaluation Without Predefined Measurements
arXiv:2601.04235v1 Announce Type: new
Abstract: Obtaining reliable feedback from the environment is a fundamental capability for intelligent agents to evaluate the correctness of their actions and to accumulate reusable knowledge. However, most existing approaches rely on predefined measurements or...
Lightweight Transformer Architectures for Edge Devices in Real-Time Applications
arXiv:2601.03290v1 Announce Type: new
Abstract: The deployment of transformer-based models on resource-constrained edge devices represents a critical challenge in enabling real-time artificial intelligence applications. This comprehensive survey examines lightweight transformer architectures specif...
Ratio-Variance Regularized Policy Optimization for Efficient LLM Fine-tuning
arXiv:2601.03320v1 Announce Type: new
Abstract: On-policy reinforcement learning (RL), particularly Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), has become the dominant paradigm for fine-tuning large language models (LLMs). While policy ratio clipping stabilizes...
Mastering the Game of Go with Self-play Experience Replay
arXiv:2601.03306v1 Announce Type: new
Abstract: The game of Go has long served as a benchmark for artificial intelligence, demanding sophisticated strategic reasoning and long-term planning. Previous approaches, such as AlphaGo and its successors, have predominantly relied on model-based Monte-Carlo...
Toward Maturity-Based Certification of Embodied AI: Quantifying Trustworthiness Through Measurement Mechanisms
arXiv:2601.03470v2 Announce Type: new
Abstract: We propose a maturity-based framework for certifying embodied AI systems through explicit measurement mechanisms. We argue that certifiable embodied AI requires structured assessment frameworks, quantitative scoring mechanisms, and methods for navigat...
Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
arXiv:2601.03335v1 Announce Type: new
Abstract: Large language models (LLMs) are increasingly being used to evolve solutions to problems in many domains, in a process inspired by biological evolution. However, unlike biological evolution, most LLM-evolution frameworks are formulated as static optim...
Exploration Through Introspection: A Self-Aware Reward Model
arXiv:2601.03389v1 Announce Type: new
Abstract: Understanding how artificial agents model internal mental states is central to advancing Theory of Mind in AI. Evidence points to a unified system for self- and other-awareness. We explore this self-awareness by having reinforcement learning agents in...
Enhancing LLM Instruction Following: An Evaluation-Driven Multi-Agentic Workflow for Prompt Instructions Optimization
arXiv:2601.03359v1 Announce Type: new
Abstract: Large Language Models (LLMs) often generate substantively relevant content but fail to adhere to formal constraints, leading to outputs that are conceptually correct but procedurally flawed. Traditional prompt refinement approaches focus on rephrasing...
Polynomial Convergence of Riemannian Diffusion Models
arXiv:2601.02499v1 Announce Type: new
Abstract: Diffusion models have demonstrated remarkable empirical success in recent years and are considered among the state-of-the-art generative models in modern AI. These models consist of a forward process, which gradually diffuses the data distributio...
arXiv:2601.02433v1 Announce Type: new
Abstract: Digital AI systems spanning large language models, vision models, and generative architectures operate primarily in symbolic, linguistic, or pixel domains. They have achieved striking progress, but almost all of this progress lives in virtual spa...
WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks
arXiv:2601.02439v1 Announce Type: new
Abstract: We present WebGym, the largest-to-date open-source environment for training realistic visual web agents. Real websites are non-stationary and diverse, making artificial or small-scale task sets insufficient for robust policy learning. WebGym contains ...
GEM-Style Constraints for PEFT with Dual Gradient Projection in LoRA
arXiv:2601.02500v1 Announce Type: new
Abstract: Full fine-tuning of Large Language Models (LLMs) is computationally costly, motivating Continual Learning (CL) approaches that utilize parameter-efficient adapters. We revisit Gradient Episodic Memory (GEM) within the Low-Rank Adapter (LoRA) subspace ...
Orchestral AI: A Framework for Agent Orchestration
arXiv:2601.02577v1 Announce Type: new
Abstract: The rapid proliferation of LLM agent frameworks has forced developers to choose between vendor lock-in through provider-specific SDKs and complex multi-package ecosystems that obscure control flow and hinder reproducibility. Integrating tool calling a...
AWARE-US: Benchmark for Preference-Aware Resolution in Tool-Calling Agents
arXiv:2601.02643v1 Announce Type: new
Abstract: Tool-calling conversational agents querying structured databases often face two linked failures: underspecification (missing constraints needed to run a precise query) and infeasibility (the fully specified query returns an empty set because no item s...
SimpleMem: Efficient Lifelong Memory for LLM Agents
arXiv:2601.02553v1 Announce Type: new
Abstract: To support reliable long-term interaction in complex environments, LLM agents require memory systems that efficiently manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to...
Textual Explanations and Their Evaluations for Reinforcement Learning Policy
arXiv:2601.02514v1 Announce Type: new
Abstract: Understanding a Reinforcement Learning (RL) policy is crucial for ensuring that autonomous agents behave according to human expectations. This goal can be achieved using Explainable Reinforcement Learning (XRL) techniques. Although textual explanation...
Intrinsic-Metric Physics-Informed Neural Networks (IM-PINN) for Reaction-Diffusion Dynamics on Complex Riemannian Manifolds
arXiv:2601.00834v1 Announce Type: new
Abstract: Simulating nonlinear reaction-diffusion dynamics on complex, non-Euclidean manifolds remains a fundamental challenge in computational morphogenesis, constrained by high-fidelity mesh generation costs and symplectic drift in discrete time-stepping sche...
ShrimpXNet: A Transfer Learning Framework for Shrimp Disease Classification with Augmented Regularization, Adversarial Training, and Explainable AI
arXiv:2601.00832v1 Announce Type: new
Abstract: Shrimp is one of the most widely consumed aquatic species globally, valued for both its nutritional content and economic importance. Shrimp farming represents a significant source of income in many regions; however, like other forms of aquaculture, it...