Improving Interactive In-Context Learning from Natural Language Feedback
arXiv:2602.16066v1 Announce Type: new
Abstract: Adapting one's thought process based on corrective feedback is an essential ability in human learning, particularly in collaborative settings. In contrast, the current large language model training paradigm relies heavily on modeling vast, static corp...
Refine Now, Query Fast: A Decoupled Refinement Paradigm for Implicit Neural Fields
arXiv:2602.15155v1 Announce Type: new
Abstract: Implicit Neural Representations (INRs) have emerged as promising surrogates for large 3D scientific simulations due to their ability to continuously model spatial and conditional fields, yet they face a critical fidelity-speed dilemma: deep MLPs suffe...
PolyNODE: Variable-dimension Neural ODEs on M-polyfolds
arXiv:2602.15128v1 Announce Type: new
Abstract: Neural ordinary differential equations (NODEs) are geometric deep learning models based on dynamical systems and flows generated by vector fields on manifolds. Despite numerous successful applications, particularly within the flow matching paradigm, a...
Near-Optimal Sample Complexity for Online Constrained MDPs
arXiv:2602.15076v1 Announce Type: new
Abstract: Safety is a fundamental challenge in reinforcement learning (RL), particularly in real-world applications such as autonomous driving, robotics, and healthcare. To address this, Constrained Markov Decision Processes (CMDPs) are commonly used to enforce...
Hybrid Feature Learning with Time Series Embeddings for Equipment Anomaly Prediction
arXiv:2602.15089v1 Announce Type: new
Abstract: In predictive maintenance of equipment, deep learning-based time series anomaly detection has garnered significant attention; however, pure deep learning approaches often fail to achieve sufficient accuracy on real-world data. This study proposes a hy...
Protecting Language Models Against Unauthorized Distillation through Trace Rewriting
arXiv:2602.15143v1 Announce Type: new
Abstract: Knowledge distillation is a widely adopted technique for transferring capabilities from LLMs to smaller, more efficient student models. However, unauthorized use of knowledge distillation takes unfair advantage of the considerable effort and cost put ...
ResearchGym: Evaluating Language Model Agents on Real-World AI Research
arXiv:2602.15112v1 Announce Type: new
Abstract: We introduce ResearchGym, a benchmark and execution environment for evaluating AI agents on end-to-end research. To instantiate this, we repurpose five oral and spotlight papers from ICML, ICLR, and ACL. From each paper's repository, we preserve the d...
da Costa and Tarski meet Goguen and Carnap: a novel approach for ontological heterogeneity based on consequence systems
arXiv:2602.15158v1 Announce Type: new
Abstract: This paper presents a novel approach for ontological heterogeneity that draws heavily from Carnapian-Goguenism, as presented by Kutz, Mossakowski and Lücke (2010). The approach is provisionally designated da Costian-Tarskianism, named after da Costa...
Attention-gated U-Net model for semantic segmentation of brain tumors and feature extraction for survival prognosis
arXiv:2602.15067v1 Announce Type: new
Abstract: Gliomas, among the most common primary brain tumors, vary widely in aggressiveness, prognosis, and histology, making treatment challenging due to complex and time-intensive surgical interventions. This study presents an Attention-Gated Recurrent Resid...
Panini: Continual Learning in Token Space via Structured Memory
arXiv:2602.15156v1 Announce Type: new
Abstract: Language models are increasingly used to reason over content they were not trained on, such as new documents, evolving knowledge, and user-specific data. A common approach is retrieval-augmented generation (RAG), which stores verbatim documents extern...
Accelerated Discovery of Cryoprotectant Cocktails via Multi-Objective Bayesian Optimization
arXiv:2602.13398v1 Announce Type: new
Abstract: Designing cryoprotectant agent (CPA) cocktails for vitrification is challenging because formulations must be concentrated enough to suppress ice formation yet non-toxic enough to preserve cell viability. This tradeoff creates a large, multi-objective ...
Directional Concentration Uncertainty: A representational approach to uncertainty quantification for generative models
arXiv:2602.13264v1 Announce Type: new
Abstract: In the critical task of making generative models trustworthy and robust, methods for Uncertainty Quantification (UQ) have begun to show encouraging potential. However, many of these methods rely on rigid heuristics that fail to generalize across tasks...
BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents
arXiv:2602.13345v1 Announce Type: new
Abstract: Decades of engineering drawings and technical records remain locked in legacy archives with inconsistent or missing metadata, making retrieval difficult and often manual. We present Blueprint, a layout-aware multimodal retrieval system designed for la...
The Speed-up Factor: A Quantitative Multi-Iteration Active Learning Performance Metric
arXiv:2602.13359v1 Announce Type: new
Abstract: Machine learning models excel with abundant annotated data, but annotation is often costly and time-intensive. Active learning (AL) aims to improve the performance-to-annotation ratio by using query methods (QMs) to iteratively select the most informa...
Exploring the Performance of ML/DL Architectures on the MNIST-1D Dataset
arXiv:2602.13348v1 Announce Type: new
Abstract: Small datasets like MNIST have historically been instrumental in advancing machine learning research by providing a controlled environment for rapid experimentation and model evaluation. However, their simplicity often limits their utility for disting...
VeRA: Verified Reasoning Data Augmentation at Scale
arXiv:2602.13217v1 Announce Type: new
Abstract: The main issue with most evaluation schemes today is their "static" nature: the same problems are reused repeatedly, allowing for memorization, format exploitation, and eventual saturation. To measure genuine AI progress, we need evaluation that is ro...
Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning
arXiv:2602.13218v1 Announce Type: new
Abstract: Scaling verifiable training signals remains a key bottleneck for Reinforcement Learning from Verifiable Rewards (RLVR). Logical reasoning is a natural substrate: constraints are formal and answers are programmatically checkable. However, prior synthes...
When to Think Fast and Slow? AMOR: Entropy-Based Metacognitive Gate for Dynamic SSM-Attention Switching
arXiv:2602.13215v1 Announce Type: new
Abstract: Transformers allocate uniform computation to every position, regardless of difficulty. State Space Models (SSMs) offer efficient alternatives but struggle with precise information retrieval over a long horizon. Inspired by dual-process theories of cog...
BotzoneBench: Scalable LLM Evaluation via Graded AI Anchors
arXiv:2602.13214v1 Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly deployed in interactive environments requiring strategic decision-making, yet systematic evaluation of these capabilities remains challenging. Existing benchmarks for LLMs primarily assess static reasoning...
Agentic AI for Commercial Insurance Underwriting with Adversarial Self-Critique
arXiv:2602.13213v1 Announce Type: new
Abstract: Commercial insurance underwriting is a labor-intensive process that requires manual review of extensive documentation to assess risk and determine policy pricing. While AI offers substantial efficiency improvements, existing solutions lack comprehensi...
Abstractive Red-Teaming of Language Model Character
arXiv:2602.12318v1 Announce Type: new
Abstract: We want language model assistants to conform to a character specification, which asserts how the model should act across diverse user interactions. While models typically follow these character specifications, they can occasionally violate them in lar...
OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization
arXiv:2602.12305v1 Announce Type: new
Abstract: Generating high-performance CUDA kernels remains challenging due to the need to navigate a combinatorial space of low-level transformations under noisy and expensive hardware feedback. Although large language models can synthesize functionally correct...
Intrinsic Credit Assignment for Long Horizon Interaction
arXiv:2602.12342v1 Announce Type: new
Abstract: How can we train agents to navigate uncertainty over long horizons? In this work, we propose ΔBelief-RL, which leverages a language model's own intrinsic beliefs to reward intermediate progress. Our method utilizes the change in the probability...
The Appeal and Reality of Recycling LoRAs with Adaptive Merging
arXiv:2602.12323v1 Announce Type: new
Abstract: The widespread availability of fine-tuned LoRA modules for open pre-trained models has led to an interest in methods that can adaptively merge LoRAs to improve performance. These methods typically include some way of selecting LoRAs from a pool and tu...