MIMIC-RD: Can LLMs differentially diagnose rare diseases in real-world clinical settings?
arXiv:2601.11559v1 Announce Type: new
Abstract: Despite rare diseases affecting 1 in 10 Americans, their differential diagnosis remains challenging. Due to their impressive recall abilities, large language models (LLMs) have been recently explored for differential diagnosis. Existing approaches to ...
arXiv:2601.11604v1 Announce Type: new
Abstract: Multi-objective reinforcement learning (MORL) enables agents to optimize vector-valued rewards while respecting user preferences. CAPQL, a preference-conditioned actor-critic method, achieves this by conditioning on weight vectors w and restricts data...
GRADE: Replacing Policy Gradients with Backpropagation for LLM Alignment
arXiv:2601.11574v1 Announce Type: new
Abstract: Reinforcement learning from human feedback (RLHF) has become the dominant paradigm for aligning large language models with human preferences. However, policy gradient methods such as PPO suffer from high variance gradient estimates, requiring careful ...
Discrete Semantic States and Hamiltonian Dynamics in LLM Embedding Spaces
arXiv:2601.11572v1 Announce Type: new
Abstract: We investigate the structure of Large Language Model (LLM) embedding spaces using mathematical concepts, particularly linear algebra and the Hamiltonian formalism, drawing inspiration from analogies with quantum mechanical systems. Motivated by the ob...
AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control
arXiv:2601.11568v1 Announce Type: new
Abstract: Training Large Language Models (LLMs) is highly memory-intensive due to optimizer state overhead. The FRUGAL framework mitigates this with gradient splitting, but its static hyperparameters -- the subspace ratio ($\rho$) and update frequency ($T$) -- ...
CSyMR: Benchmarking Compositional Symbolic Muisc Reasoning With MIR Tool Integration
arXiv:2601.11556v1 Announce Type: new
Abstract: Large Language Models (LLMs) are leveraged in symbolic music reasoning, yet existing benchmarks emphasize isolated knowledge or atomic analyses rather than the integrative compositional reasoning needed to connect musical structures. To address this, ...
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Diffusion large language models (dLLMs) are compelling alternatives to autoregressive (AR) models because their denoising models operate over the entire sequence. The global planning and iterative refinement features of dLLMs are particularly useful for code generation. However, current training and...
Stanford researchers have developed a deep learning model that transforms overwhelming brain data into clear trajectories, opening new possibilities for understanding thought, emotion, and neurological disease.
Multimodal reinforcement learning with agentic verifier for AI agents
Argos improves multimodal RL by evaluating whether an agent’s reasoning aligns with what it observes over time. The approach reduces visual hallucinations and produces more reliable, data-efficient agents for real-world applications.
The post Multimodal reinforcement learning with agentic verifier f...
Unbreakable? Researchers warn quantum computers have serious security flaws
Quantum computers could revolutionize everything from drug discovery to business analytics—but their incredible power also makes them surprisingly vulnerable. New research from Penn State warns that today’s quantum machines are not just futuristic tools, but potential gold mines for hackers. The stu...
Towards Tensor Network Models for Low-Latency Jet Tagging on FPGAs
arXiv:2601.10801v1 Announce Type: new
Abstract: We present a systematic study of Tensor Network (TN) models $\unicode{x2013}$ Matrix Product States (MPS) and Tree Tensor Networks (TTN) $\unicode{x2013}$ for real-time jet tagging in high-energy physics, with a focus on low-latency deployment on Fiel...
Unified Optimization of Source Weights and Transfer Quantities in Multi-Source Transfer Learning: An Asymptotic Framework
arXiv:2601.10779v1 Announce Type: new
Abstract: Transfer learning plays a vital role in improving model performance in data-scarce scenarios. However, naive uniform transfer from multiple source tasks may result in negative transfer, highlighting the need to properly balance the contributions of he...
Digital Metabolism: Decoupling Logic from Facts via Regenerative Unlearning -- Towards a Pure Neural Logic Core
arXiv:2601.10810v1 Announce Type: new
Abstract: Large language models (LLMs) currently suffer from parameter entanglement, where general reasoning capabilities (logic) and specific factual knowledge (facts) exist in a superposition state within shared weights. This coupling leads to the "memory wal...
Towards Reliable ML Feature Engineering via Planning in Constrained-Topology of LLM Agents
arXiv:2601.10820v1 Announce Type: new
Abstract: Recent advances in code generation models have unlocked unprecedented opportunities for automating feature engineering, yet their adoption in real-world ML teams remains constrained by critical challenges: (i) the scarcity of datasets capturing the it...
Japanese AI Agent System on Human Papillomavirus Vaccination: System Design
arXiv:2601.10718v1 Announce Type: new
Abstract: Human papillomavirus (HPV) vaccine hesitancy poses significant public health challenges, particularly in Japan where proactive vaccination recommendations were suspended from 2013 to 2021. The resulting information gap is exacerbated by misinformation...
Building AI Agents to Improve Job Referral Requests to Strangers
arXiv:2601.10726v1 Announce Type: new
Abstract: This paper develops AI agents that help job seekers write effective requests for job referrals in a professional online community. The basic workflow consists of an improver agent that rewrites the referral request and an evaluator agent that measures...
ORBITFLOW: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration
arXiv:2601.10729v1 Announce Type: new
Abstract: Serving long-context LLMs is challenging because request lengths and batch composition vary during token generation, causing the memory footprint to fluctuate significantly at runtime. Offloading KV caches to host memory limits effective memory usage,...
Do You Trust Me? Cognitive-Affective Signatures of Trustworthiness in Large Language Models
arXiv:2601.10719v1 Announce Type: new
Abstract: Perceived trustworthiness underpins how users navigate online information, yet it remains unclear whether large language models (LLMs),increasingly embedded in search, recommendation, and conversational systems, represent this construct in psychologic...
CTHA: Constrained Temporal Hierarchical Architecture for Stable Multi-Agent LLM Systems
arXiv:2601.10738v1 Announce Type: new
Abstract: Recently, multi-time-scale agent architectures have extended the ubiquitous single-loop paradigm by introducing temporal hierarchies with distinct cognitive layers. While yielding substantial performance gains, this diversification fundamentally compr...
The breakthrough that makes robot faces feel less creepy
Humans pay enormous attention to lips during conversation, and robots have struggled badly to keep up. A new robot developed at Columbia Engineering learned realistic lip movements by watching its own reflection and studying human videos online. This allowed it to speak and sing with synchronized fa...
arXiv:2601.09825v1 Announce Type: new
Abstract: We establish a lower bound on the eluder dimension of generalised linear model classes, showing that standard eluder dimension-based analysis cannot lead to first-order regret bounds. To address this, we introduce a localisation method for the eluder ...