PCN-Rec: Agentic Proof-Carrying Negotiation for Reliable Governance-Constrained Recommendation
arXiv:2601.09771v1 Announce Type: new
Abstract: Modern LLM-based recommenders can generate compelling ranked lists, but they struggle to reliably satisfy governance constraints such as minimum long-tail exposure or diversity requirements. We present PCN-Rec, a proof-carrying negotiation pipeline th...
Improving Chain-of-Thought for Logical Reasoning via Attention-Aware Intervention
arXiv:2601.09805v1 Announce Type: new
Abstract: Modern logical reasoning with LLMs primarily relies on complex interactive frameworks that decompose the reasoning process into subtasks solved through carefully designed prompts, or that require external resources (e.g., symbolic solvers) to ex...
Antisocial behavior towards large language model users: experimental evidence
arXiv:2601.09772v1 Announce Type: new
Abstract: The rapid spread of large language models (LLMs) has raised concerns about the social reactions they provoke. Prior research documents negative attitudes toward AI users, but it remains unclear whether such disapproval translates into costly action. W...
Spectral Generative Flow Models: A Physics-Inspired Replacement for Vectorized Large Language Models
arXiv:2601.08893v1 Announce Type: new
Abstract: We introduce Spectral Generative Flow Models (SGFMs), a physics-inspired alternative to transformer-based large language models. Instead of representing text or video as sequences of discrete tokens processed by attention, SGFMs treat generation as th...
XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation
arXiv:2601.08896v1 Announce Type: new
Abstract: This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns ...
Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation
arXiv:2601.08963v1 Announce Type: new
Abstract: Diffusion models have emerged as a powerful class of generative models for molecular design, capable of capturing complex structural distributions and achieving high fidelity in 3D molecule generation. However, their widespread use remains constrained...
DriftGuard: A Hierarchical Framework for Concept Drift Detection and Remediation in Supply Chain Forecasting
arXiv:2601.08928v1 Announce Type: new
Abstract: Supply chain forecasting models degrade over time as real-world conditions change. Promotions shift, consumer preferences evolve, and supply disruptions alter demand patterns, causing what is known as concept drift. This silent degradation leads to st...
arXiv:2601.09072v1 Announce Type: new
Abstract: Developing safe, effective, and practically useful clinical prediction models (CPMs) traditionally requires iterative collaboration between clinical experts, data scientists, and informaticists. This process refines the often small but critical detail...
Programming over Thinking: Efficient and Robust Multi-Constraint Planning
arXiv:2601.09097v1 Announce Type: new
Abstract: Multi-constraint planning involves identifying, evaluating, and refining candidate plans while satisfying multiple, potentially conflicting constraints. Existing large language model (LLM) approaches face fundamental limitations in this domain. Pure r...
ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue
arXiv:2601.08950v1 Announce Type: new
Abstract: In educational applications, LLMs exhibit several fundamental pedagogical limitations, such as their tendency to reveal solutions rather than support dialogic learning. We introduce ConvoLearn (https://huggingface.co/datasets/masharma/convolearn), a ...
ART: Action-based Reasoning Task Benchmarking for Medical AI Agents
arXiv:2601.08988v1 Announce Type: new
Abstract: Reliable clinical decision support requires medical AI agents capable of safe, multi-step reasoning over structured electronic health records (EHRs). While large language models (LLMs) show promise in healthcare, existing benchmarks inadequately asses...
The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments
arXiv:2601.09032v1 Announce Type: new
Abstract: The advancement of large language model (LLM) based agents has shifted AI evaluation from single-turn response assessment to multi-step task completion in interactive environments. We present an empirical study evaluating frontier AI models on 150 wor...
Hierarchical Sparse Plus Low Rank Compression of LLM
arXiv:2601.07839v1 Announce Type: new
Abstract: Modern large language models (LLMs) place extraordinary pressure on memory and compute budgets, making principled compression indispensable for both deployment and continued training. We present Hierarchical Sparse Plus Low-Rank (HSS) compression, a t...
Multiplicative Orthogonal Sequential Editing for Language Models
arXiv:2601.07873v1 Announce Type: new
Abstract: Knowledge editing aims to efficiently modify the internal knowledge of large language models (LLMs) without compromising their other capabilities. The prevailing editing paradigm, which appends an update matrix to the original parameter matrix, has be...
RewriteNets: End-to-End Trainable String-Rewriting for Generative Sequence Modeling
arXiv:2601.07868v1 Announce Type: new
Abstract: Dominant sequence models like the Transformer represent structure implicitly through dense attention weights, incurring quadratic complexity. We propose RewriteNets, a novel neural architecture built on an alternative paradigm: explicit, parallel stri...
When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning
arXiv:2601.07965v1 Announce Type: new
Abstract: When a model knows when it does not know, many possibilities emerge. The first question is how to enable a model to recognize that it does not know. A promising approach is to use confidence, computed from the model's internal signals, to reflect its ...
Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
arXiv:2601.08000v1 Announce Type: new
Abstract: Ensuring that Large Language Models (LLMs) adhere to safety principles without refusing benign requests remains a significant challenge. While OpenAI introduced deliberative alignment (DA) to enhance the safety of its o-series models through reasoning...
Bridging the Trust Gap: Clinician-Validated Hybrid Explainable AI for Maternal Health Risk Assessment in Bangladesh
arXiv:2601.07866v1 Announce Type: new
Abstract: While machine learning shows promise for maternal health risk prediction, clinical adoption in resource-constrained settings faces a critical barrier: lack of explainability and trust. This study presents a hybrid explainable AI (XAI) framework combin...
Executable Ontologies in Game Development: From Algorithmic Control to Semantic World Modeling
arXiv:2601.07964v1 Announce Type: new
Abstract: This paper examines the application of Executable Ontologies (EO), implemented through the boldsea framework, to game development. We argue that EO represents a paradigm shift: a transition from algorithmic behavior programming to semantic world model...
CrossTrafficLLM: A Human-Centric Framework for Interpretable Traffic Intelligence via Large Language Model
arXiv:2601.06042v1 Announce Type: new
Abstract: While accurate traffic forecasting is vital for Intelligent Transportation Systems (ITS), effectively communicating predicted conditions via natural language for human-centric decision support remains a challenge and is often handled separately. To ad...
Enabling Long FFT Convolutions on Memory-Constrained FPGAs via Chunking
arXiv:2601.06065v1 Announce Type: new
Abstract: The need for long-context reasoning has led to alternative neural network architectures besides Transformers and self-attention, a popular model being Hyena, which employs causal 1D-convolutions implemented with FFTs. Long convolutions enable efficien...
Tree-Preconditioned Differentiable Optimization and Axioms as Layers
arXiv:2601.06036v1 Announce Type: new
Abstract: This paper introduces a differentiable framework that embeds the axiomatic structure of Random Utility Models (RUM) directly into deep neural networks. Although projecting empirical choice data onto the RUM polytope is NP-hard in general, we uncover a...
Filtering Beats Fine Tuning: A Bayesian Kalman View of In Context Learning in LLMs
arXiv:2601.06100v1 Announce Type: new
Abstract: We present a theory-first framework that interprets inference-time adaptation in large language models (LLMs) as online Bayesian state estimation. Rather than modeling rapid adaptation as implicit optimization or meta-learning, we formulate task- and ...
From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models
arXiv:2601.06108v1 Announce Type: new
Abstract: Aligning large language models (LLMs) with human preferences has become essential for safe and beneficial AI deployment. While Reinforcement Learning from Human Feedback (RLHF) established the dominant paradigm, a proliferation of alternatives -- Dire...