Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning
arXiv:2602.09066v1 Announce Type: new
Abstract: Large-scale multimodal contrastive learning has recently achieved impressive success in learning rich and transferable representations, yet it remains fundamentally limited by the uniform treatment of feature dimensions and the neglect of the intrinsi...
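The multimodal contrastive objective this work builds on is typically the symmetric InfoNCE loss, which treats all embedding dimensions uniformly, the limitation the abstract highlights. A minimal NumPy sketch (names and shapes illustrative, not the paper's implementation):

```python
import numpy as np

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.
    Row i of img_emb is positive only with row i of txt_emb; note that
    every feature dimension contributes with equal weight."""
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (B, B) similarity matrix
    labels = np.arange(len(logits))             # positives on the diagonal

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)    # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # cross-entropy in both directions (image->text and text->image)
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 16))
loss = info_nce(a, a + 0.01 * rng.normal(size=(8, 16)))
```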
Learning to Remember, Learn, and Forget in Attention-Based Models
arXiv:2602.09075v1 Announce Type: new
Abstract: In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is ...
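The fixed-capacity associative memory the abstract refers to is the state matrix of (gated) linear attention: a d_k x d_v matrix written to at every step and decayed by a forget gate. A toy sketch of the remember/forget/read recurrence (the scalar gate here is an illustrative simplification):

```python
import numpy as np

d_k, d_v = 4, 4
rng = np.random.default_rng(1)

# Fixed-capacity associative memory: a d_k x d_v matrix regardless of
# sequence length -- the source of the capacity limit noted above.
S = np.zeros((d_k, d_v))

def step(S, k, v, q, gamma=0.9):
    """One gated linear-attention step: decay old associations (forget),
    write the new key->value outer product (remember), read with the query."""
    S = gamma * S + np.outer(k, v)   # forget + write
    y = q @ S                        # read: retrieves v when q aligns with k
    return S, y

k = rng.normal(size=d_k)
v = rng.normal(size=d_v)
S, y = step(S, k, v, q=k)            # querying with the stored key recovers v
```

Because the memory started empty, the readout is exactly (k.k) * v; with many stored pairs, interference between keys is what limits capacity.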
Patient foundation model for risk stratification in low-risk overweight patients
arXiv:2602.09079v1 Announce Type: new
Abstract: Accurate risk stratification in patients with overweight or obesity is critical for guiding preventive care and allocating high-cost therapies such as GLP-1 receptor agonists. We present PatientTPP, a neural temporal point process (TPP) model trained ...
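A temporal point process is defined by a conditional intensity over event times; neural TPPs such as the one described learn this intensity from data. As a hedged stand-in, the classical exponential-kernel (Hawkes-style) intensity shows the shape of the object being modeled (all parameter values hypothetical):

```python
import math

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.5):
    """Conditional intensity lambda(t | history): baseline rate mu plus
    a decaying bump alpha * exp(-beta * (t - t_i)) per past event t_i < t.
    A neural TPP replaces this hand-written form with a learned network."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in history if ti < t)

events = [0.5, 1.2, 3.0]             # hypothetical clinical-event timestamps
lam = hawkes_intensity(3.1, events)  # intensity shortly after the last event
```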
Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models
arXiv:2602.09080v1 Announce Type: new
Abstract: Large Multimodal Models (LMMs) have achieved remarkable success in vision-language tasks, yet their vast parameter counts are often underutilized during both training and inference. In this work, we embrace the idea of looping back to move forward: re...
A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation
arXiv:2602.09112v1 Announce Type: new
Abstract: What research can be pursued with small models trained to complete true programs? Typically, researchers study program synthesis via large language models (LLMs) which introduce issues such as knowing what is in or out of distribution, understanding f...
Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization
arXiv:2602.09121v1 Announce Type: new
Abstract: In this work, we present a lightweight and privacy-preserving Multimodal Emotion Recognition (MER) framework designed for deployment on edge devices. To demonstrate the framework's versatility, our implementation uses three modalities: speech, text and f...
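A common way to use a Dirichlet parameterization for uncertainty-aware classification is the subjective-logic readout from per-class evidence; whether this paper follows it exactly is not stated in the abstract, so the sketch below is illustrative:

```python
import numpy as np

def dirichlet_uncertainty(evidence):
    """Dirichlet readout from non-negative per-class evidence:
    alpha_k = evidence_k + 1, strength S = sum(alpha),
    expected probs = alpha / S, vacuity uncertainty = K / S."""
    evidence = np.asarray(evidence, dtype=float)
    alpha = evidence + 1.0
    S = alpha.sum()                      # Dirichlet strength
    prob = alpha / S                     # expected class probabilities
    uncertainty = len(alpha) / S         # high when total evidence is low
    return prob, uncertainty

# Confident prediction: strong evidence for one emotion class
p_conf, u_conf = dirichlet_uncertainty([30.0, 1.0, 1.0])
# Uncertain prediction: almost no evidence for any class
p_unc, u_unc = dirichlet_uncertainty([0.1, 0.1, 0.1])
```

The appeal for edge deployment is that uncertainty comes from a single forward pass, with no sampling or ensembling.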
PABU: Progress-Aware Belief Update for Efficient LLM Agents
arXiv:2602.09138v1 Announce Type: new
Abstract: Large Language Model (LLM) agents commonly condition actions on full action-observation histories, which introduce task-irrelevant information that easily leads to redundant actions and higher inference cost. We propose Progress-Aware Belief Update (P...
CoMMa: Contribution-Aware Medical Multi-Agents From A Game-Theoretic Perspective
arXiv:2602.09159v1 Announce Type: new
Abstract: Recent multi-agent frameworks have broadened the ability to tackle oncology decision support tasks that require reasoning over dynamic, heterogeneous patient data. We propose Contribution-Aware Medical Multi-Agents (CoMMa), a decentralized LLM-agent f...
FlyAOC: Evaluating Agentic Ontology Curation of Drosophila Scientific Knowledge Bases
arXiv:2602.09163v1 Announce Type: new
Abstract: Scientific knowledge bases accelerate discovery by curating findings from primary literature into structured, queryable formats for both human researchers and emerging AI systems. Maintaining these resources requires expert curators to search relevant...
arXiv:2602.06993v1 Announce Type: new
Abstract: Transformers achieve strong language modeling accuracy, yet their position-wise feed-forward networks (FFNs) are dense, globally shared, and typically updated end to end. These properties create two practical tensions. First, dense FFNs spend the same...
Neural Sabermetrics with World Model: Play-by-play Predictive Modeling with Large Language Model
arXiv:2602.07030v1 Announce Type: new
Abstract: Classical sabermetrics has profoundly shaped baseball analytics by summarizing long histories of play into compact statistics. While these metrics are invaluable for valuation and retrospective analysis, they do not define a generative model of how ba...
TransConv-DDPM: Enhanced Diffusion Model for Generating Time-Series Data in Healthcare
arXiv:2602.07033v1 Announce Type: new
Abstract: The lack of real-world data in clinical fields poses a major obstacle in training effective AI models for diagnostic and preventive tools in medicine. Generative AI has shown promise in increasing data volume and enhancing model training, particularly...
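Any DDPM variant, time-series ones included, trains against the same closed-form forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, with abar_t the cumulative product of (1 - beta_t). A minimal sketch on a toy 1-D signal (schedule and shapes are illustrative, not the paper's):

```python
import numpy as np

def ddpm_forward(x0, t, betas, rng):
    """Sample x_t from the closed-form DDPM forward (noising) process.
    The model is then trained to predict eps from (x_t, t)."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)[t]                 # cumulative signal retention
    eps = rng.normal(size=x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps, eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)            # common linear schedule
x0 = np.sin(np.linspace(0.0, 6.28, 64))          # toy "time series"
x_t, eps = ddpm_forward(x0, t=999, betas=betas, rng=rng)  # nearly pure noise
```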
AVERE: Improving Audiovisual Emotion Reasoning with Preference Optimization
arXiv:2602.07054v1 Announce Type: new
Abstract: Emotion understanding is essential for building socially intelligent agents. Although recent multimodal large language models have shown strong performance on this task, two key challenges remain - spurious associations between emotions and irrelevant...
LLM-FSM: Scaling Large Language Models for Finite-State Reasoning in RTL Code Generation
arXiv:2602.07032v1 Announce Type: new
Abstract: Finite-state reasoning, the ability to understand and implement state-dependent behavior, is central to hardware design. In this paper, we present LLM-FSM, a benchmark that evaluates how well large language models (LLMs) can recover finite-state machi...
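The state-dependent behavior such benchmarks probe is captured by a transition table; a toy Moore-style sequence detector in Python (RTL versions would be Verilog/VHDL, and all state names here are hypothetical):

```python
# Detect the overlapping input sequence 1, 0, 1 with a 4-state FSM.
TRANS = {
    ("IDLE", 1): "SAW1",  ("IDLE", 0): "IDLE",
    ("SAW1", 0): "SAW10", ("SAW1", 1): "SAW1",
    ("SAW10", 1): "HIT",  ("SAW10", 0): "IDLE",
    ("HIT", 1): "SAW1",   ("HIT", 0): "SAW10",
}

def run(bits, state="IDLE"):
    """Step the FSM over a bit sequence; count entries into state HIT."""
    hits = 0
    for b in bits:
        state = TRANS[(state, b)]
        hits += state == "HIT"
    return hits

count = run([1, 0, 1, 0, 1])   # overlapping matches at positions 3 and 5
```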
ST-Raptor: An Agentic System for Semi-Structured Table QA
arXiv:2602.07034v1 Announce Type: new
Abstract: Semi-structured table question answering (QA) is a challenging task that requires (1) precise extraction of cell contents and positions and (2) accurate recovery of key implicit logical structures, hierarchical relationships, and semantic associations...
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents
arXiv:2602.07035v1 Announce Type: new
Abstract: Recently, Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Meanwhile, despite the rapid advancement of Search Agents, thei...
Aster: Autonomous Scientific Discovery over 20x Faster Than Existing Methods
arXiv:2602.07040v1 Announce Type: new
Abstract: We introduce Aster, an AI agent for autonomous scientific discovery capable of operating over 20 times faster than existing frameworks. Given a task, an initial program, and a script to evaluate the performance of the program, Aster iteratively improv...
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?
arXiv:2602.07055v1 Announce Type: new
Abstract: Spatial embodied intelligence requires agents to act to acquire information under partial observability. While multimodal foundation models excel at passive perception, their capacity for active, self-directed exploration remains understudied. We prop...
Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization
Abstract: Efficient large-scale inference of transformer-based large language models (LLMs) remains a fundamental systems challenge, frequently requiring multi-GPU parallelism to meet stringent latency and throughput targets. Conventional tensor parallelism decomposes matrix operations across devices but intr...
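The synchronization cost in conventional tensor parallelism comes from each device holding only a shard of the weight matrix: every device produces a partial product, and an all-reduce must sum them before the next layer. A NumPy sketch with the all-reduce modeled as a plain sum (shard counts and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))                # activations
W = rng.normal(size=(8, 4))                # weight matrix to shard

n_dev = 2
x_shards = np.split(x, n_dev, axis=1)      # each "device" gets half of x's columns
W_shards = np.split(W, n_dev, axis=0)      # ...and the matching rows of W
partials = [xs @ Ws for xs, Ws in zip(x_shards, W_shards)]
y = sum(partials)                          # the per-layer all-reduce sync point
```

Reducing how often this sum must happen is precisely the kind of synchronization the title refers to.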
NanoNet: Parameter-Efficient Learning with Label-Scarce Supervision for Lightweight Text Mining Model
arXiv:2602.06093v1 Announce Type: new
Abstract: The lightweight semi-supervised learning (LSL) strategy provides an effective approach to conserving labeled samples and minimizing model inference costs. Prior research has effectively applied knowledge transfer learning and co-training regularizatio...
Agentic Workflow Using RBA$_\theta$ for Event Prediction
arXiv:2602.06097v1 Announce Type: new
Abstract: Wind power ramp events are difficult to forecast due to strong variability, multi-scale dynamics, and site-specific meteorological effects. This paper proposes an event-first, frequency-aware forecasting paradigm that directly predicts ramp events and...
Toward Faithful and Complete Answer Construction from a Single Document
arXiv:2602.06103v1 Announce Type: new
Abstract: Modern large language models (LLMs) are powerful generators driven by statistical next-token prediction. While effective at producing fluent text, this design biases models toward high-probability continuations rather than exhaustive and faithful answ...
Pragmatic Curiosity: A Hybrid Learning-Optimization Paradigm via Active Inference
arXiv:2602.06104v1 Announce Type: new
Abstract: Many engineering and scientific workflows depend on expensive black-box evaluations, requiring decision-making that simultaneously improves performance and reduces uncertainty. Bayesian optimization (BO) and Bayesian experimental design (BED) offer po...