Antisocial behavior towards large language model users: experimental evidence
arXiv:2601.09772v1 Announce Type: new
Abstract: The rapid spread of large language models (LLMs) has raised concerns about the social reactions they provoke. Prior research documents negative attitudes toward AI users, but it remains unclear whether such disapproval translates into costly action. W...
PCN-Rec: Agentic Proof-Carrying Negotiation for Reliable Governance-Constrained Recommendation
arXiv:2601.09771v1 Announce Type: new
Abstract: Modern LLM-based recommenders can generate compelling ranked lists, but they struggle to reliably satisfy governance constraints such as minimum long-tail exposure or diversity requirements. We present PCN-Rec, a proof-carrying negotiation pipeline th...
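The truncated abstract does not show the proof format; as a minimal sketch of what a machine-checkable governance certificate might verify, consider a minimum long-tail exposure check over the top of a ranked list (all names and thresholds below are hypothetical, not the paper's):

```python
# Illustrative governance-constraint check for a ranked list.
# All identifiers are hypothetical; the paper's actual proof format
# is not shown in the truncated abstract.

def verify_long_tail_exposure(ranking, long_tail_ids, k=10, min_share=0.3):
    """Check that at least `min_share` of the top-k items are long-tail."""
    top_k = ranking[:k]
    share = sum(item in long_tail_ids for item in top_k) / k
    return share >= min_share

ranking = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
long_tail = {"c", "e", "g", "i"}
assert verify_long_tail_exposure(ranking, long_tail, k=10, min_share=0.3)
```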
GUI-Eyes: Tool-Augmented Perception for Visual Grounding in GUI Agents
arXiv:2601.09770v1 Announce Type: new
Abstract: Recent advances in vision-language models (VLMs) and reinforcement learning (RL) have driven progress in GUI automation. However, most existing methods rely on static, one-shot visual inputs and passive perception, lacking the ability to adaptively de...
AI Survival Stories: a Taxonomic Analysis of AI Existential Risk
arXiv:2601.09765v1 Announce Type: new
Abstract: Since the release of ChatGPT, there has been considerable debate about whether AI systems pose an existential risk to humanity. This paper develops a general framework for thinking about the existential risk of AI systems. We analyze a two-premise argumen...
arXiv:2601.09825v1 Announce Type: new
Abstract: We establish a lower bound on the eluder dimension of generalised linear model classes, showing that standard eluder dimension-based analysis cannot lead to first-order regret bounds. To address this, we introduce a localisation method for the eluder ...
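For context on the quantity being bounded (definition ours, following the standard formulation): a point $x$ is $\varepsilon$-dependent on $x_1,\dots,x_k$ with respect to a function class $\mathcal{F}$ if every pair $f, f' \in \mathcal{F}$ satisfying $\sum_{i=1}^{k} (f(x_i) - f'(x_i))^2 \le \varepsilon^2$ also satisfies $|f(x) - f'(x)| \le \varepsilon$; the eluder dimension $\dim_E(\mathcal{F}, \varepsilon)$ is the length of the longest sequence in which every element is $\varepsilon'$-independent of its predecessors for some $\varepsilon' \ge \varepsilon$. The abstract's localisation method refines this quantity; details are in the full text.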
arXiv:2601.09809v1 Announce Type: new
Abstract: Organizations and enterprises across domains such as healthcare, finance, and scientific research are increasingly required to extract collective intelligence from distributed, siloed datasets while adhering to strict privacy, regulatory, and sovereig...
TimeSAE: Sparse Decoding for Faithful Explanations of Black-Box Time Series Models
arXiv:2601.09776v1 Announce Type: new
Abstract: As black box models and pretrained models gain traction in time series applications, understanding and explaining their predictions becomes increasingly vital, especially in high-stakes domains where interpretability and trust are essential. However, ...
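The abstract is cut off before the method; the name suggests a sparse autoencoder over a black-box model's internal features. As a generic sketch of that ingredient (a torch SAE with an L1 sparsity penalty; the architecture and hyperparameters here are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Generic SAE: an overcomplete ReLU latent trained with an L1 penalty
    so each activation decomposes into a few sparse directions."""
    def __init__(self, d_model, d_latent):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model)

    def forward(self, h):
        z = torch.relu(self.enc(h))
        return self.dec(z), z

sae = SparseAutoencoder(d_model=64, d_latent=256)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
h = torch.randn(512, 64)  # stand-in for black-box model activations
for _ in range(10):
    recon, z = sae(h)
    loss = ((recon - h) ** 2).mean() + 1e-3 * z.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()
```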
The Geometry of Thought: Disclosing the Transformer as a Tropical Polynomial Circuit
arXiv:2601.09775v1 Announce Type: new
Abstract: We prove that the Transformer self-attention mechanism in the high-confidence regime ($\beta \to \infty$, where $\beta$ is an inverse temperature) operates in the tropical semiring (max-plus algebra). In particular, we show that taking the tropical li...
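The claim rests on the standard max-plus degeneration of log-sum-exp; as a one-line sketch (notation ours, not the paper's exact statement): \[ \lim_{\beta \to \infty} \frac{1}{\beta} \log \sum_{j} e^{\beta s_j} = \max_j s_j, \] so as $\beta \to \infty$ the softmax over attention scores $s_j$ concentrates on $\arg\max_j s_j$, and only maximum and addition of scores, i.e. the operations of the tropical (max-plus) semiring, survive the limit.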
Social Determinants of Health Prediction for ICD-9 Code with Reasoning Models
arXiv:2601.09709v1 Announce Type: new
Abstract: Social Determinants of Health correlate with patient outcomes but are rarely captured in structured data. Recent attention has been given to automatically extracting these markers from clinical text to supplement diagnostic systems with knowledge of p...
The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
Large-scale models are pretrained on massive web-crawled datasets containing documents of mixed quality, making data filtering essential. A popular method is Classifier-based Quality Filtering (CQF), which trains a binary classifier to distinguish between pretraining data and a small, high-quality s...
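As described, CQF is a standard recipe: train a binary classifier with the high-quality seed as positives and raw crawl text as negatives, then keep documents whose score clears a threshold. A minimal sketch (a TF-IDF logistic classifier stands in for the production-scale classifier; all data and thresholds are illustrative):

```python
# Minimal classifier-based quality filtering (CQF) sketch.
# Positives: a small high-quality seed; negatives: raw web text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

high_quality = ["a carefully edited encyclopedia article on physics",
                "a peer reviewed abstract describing a controlled study"]
web_crawl    = ["buy cheap meds now click here",
                "lorem ipsum filler text text text"]

texts = high_quality + web_crawl
y = [1] * len(high_quality) + [0] * len(web_crawl)

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), y)

# Keep candidate documents whose predicted quality clears a threshold.
candidates = ["an informative explanation of gradient descent",
              "click here here here free free free"]
scores = clf.predict_proba(vec.transform(candidates))[:, 1]
kept = [d for d, s in zip(candidates, scores) if s >= 0.5]
```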
ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models
Recurrent Neural Networks (RNNs) laid the foundation for sequence modeling, but their intrinsic sequential nature restricts parallel computation, creating a fundamental barrier to scaling. This has led to the dominance of parallelizable architectures like Transformers and, more recently, State Space...
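The abstract is cut off before the method details; one generic way to expose timestep parallelism in a nonlinear recurrence $h_t = f(h_{t-1}, x_t)$ is to treat the unrolled sequence as a fixed-point system and sweep all timesteps at once, as in this Jacobi-style sketch (illustrative of the problem setup, not necessarily the paper's solver):

```python
import numpy as np

def f(h_prev, x):
    # Example nonlinear cell (illustrative, not the paper's architecture).
    return np.tanh(0.5 * h_prev + x)

def parallel_unroll(x, h0, n_iters):
    """Jacobi-style fixed-point iteration: every timestep updates at once
    from the previous iterate; exact after at most T sweeps."""
    T = len(x)
    h = np.zeros(T)
    for _ in range(n_iters):
        h_prev = np.concatenate(([h0], h[:-1]))  # shift states by one step
        h = f(h_prev, x)                          # parallel across all t
    return h

x = np.random.randn(16)
h_seq, h = [], 0.0
for xt in x:                                      # sequential reference
    h = f(h, xt)
    h_seq.append(h)
assert np.allclose(parallel_unroll(x, 0.0, n_iters=16), h_seq)
```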
OptiMind: A small language model with optimization expertise
OptiMind is a small language model that converts business operation challenges, described in natural language, into mathematical formulations that optimization software can solve. It reduces formulation time and errors, and enables fast, privacy-preserving local use.
arXiv:2601.09072v1 Announce Type: new
Abstract: Developing safe, effective, and practically useful clinical prediction models (CPMs) traditionally requires iterative collaboration between clinical experts, data scientists, and informaticists. This process refines the often small but critical detail...
Spectral Generative Flow Models: A Physics-Inspired Replacement for Vectorized Large Language Models
arXiv:2601.08893v1 Announce Type: new
Abstract: We introduce Spectral Generative Flow Models (SGFMs), a physics-inspired alternative to transformer-based large language models. Instead of representing text or video as sequences of discrete tokens processed by attention, SGFMs treat generation as th...
XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation
arXiv:2601.08896v1 Announce Type: new
Abstract: This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns ...
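The setup is concrete enough to sketch: a walk-forward loop over lagged log-returns with an expanding training window (synthetic data and illustrative hyperparameters below, not the paper's full feature set):

```python
# Walk-forward one-step-ahead forecasting sketch (illustrative only).
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
log_ret = rng.normal(0, 0.01, size=500)  # stand-in for NEPSE log-returns

n_lags = 5
X = np.column_stack([log_ret[i:len(log_ret) - n_lags + i]
                     for i in range(n_lags)])
y = log_ret[n_lags:]  # row t holds lags t..t+4, target is t+5

preds = []
for t in range(400, len(y)):             # expanding window, refit each step
    model = XGBRegressor(n_estimators=100, max_depth=3, verbosity=0)
    model.fit(X[:t], y[:t])
    preds.append(model.predict(X[t:t + 1])[0])
rmse = float(np.sqrt(np.mean((np.array(preds) - y[400:]) ** 2)))
```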
DriftGuard: A Hierarchical Framework for Concept Drift Detection and Remediation in Supply Chain Forecasting
arXiv:2601.08928v1 Announce Type: new
Abstract: Supply chain forecasting models degrade over time as real-world conditions change. Promotions shift, consumer preferences evolve, and supply disruptions alter demand patterns, causing what is known as concept drift. This silent degradation leads to st...
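The hierarchy itself is not shown in the truncated abstract; a common building block for catching this kind of silent degradation is the Page-Hinkley test over a stream of forecast errors. A minimal sketch (thresholds illustrative, not DriftGuard's):

```python
class PageHinkley:
    """Minimal Page-Hinkley test: flags a sustained upward shift in the
    mean of a stream (e.g., forecast absolute error). Illustrative only."""
    def __init__(self, delta=0.005, threshold=1.0):
        self.delta, self.threshold = delta, threshold
        self.mean, self.n, self.cum, self.cum_min = 0.0, 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n    # running mean
        self.cum += x - self.mean - self.delta   # cumulative deviation
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold  # drift flagged

detector = PageHinkley()
errors = [0.1] * 50 + [0.6] * 20   # error level jumps: concept drift
flags = [detector.update(e) for e in errors]
assert any(flags[50:])
```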
Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation
arXiv:2601.08963v1 Announce Type: new
Abstract: Diffusion models have emerged as a powerful class of generative models for molecular design, capable of capturing complex structural distributions and achieving high fidelity in 3D molecule generation. However, their widespread use remains constrained...
ART: Action-based Reasoning Task Benchmarking for Medical AI Agents
arXiv:2601.08988v1 Announce Type: new
Abstract: Reliable clinical decision support requires medical AI agents capable of safe, multi-step reasoning over structured electronic health records (EHRs). While large language models (LLMs) show promise in healthcare, existing benchmarks inadequately asses...
ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue
arXiv:2601.08950v1 Announce Type: new
Abstract: In educational applications, LLMs exhibit several fundamental pedagogical limitations, such as their tendency to reveal solutions rather than support dialogic learning. We introduce ConvoLearn (https://huggingface.co/datasets/masharma/convolearn), a ...
Programming over Thinking: Efficient and Robust Multi-Constraint Planning
arXiv:2601.09097v1 Announce Type: new
Abstract: Multi-constraint planning involves identifying, evaluating, and refining candidate plans while satisfying multiple, potentially conflicting constraints. Existing large language model (LLM) approaches face fundamental limitations in this domain. Pure r...
The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments
arXiv:2601.09032v1 Announce Type: new
Abstract: The advancement of large language model (LLM) based agents has shifted AI evaluation from single-turn response assessment to multi-step task completion in interactive environments. We present an empirical study evaluating frontier AI models on 150 wor...
Hierarchical Sparse Plus Low Rank Compression of LLM
arXiv:2601.07839v1 Announce Type: new
Abstract: Modern large language models (LLMs) place extraordinary pressure on memory and compute budgets, making principled compression indispensable for both deployment and continued training. We present Hierarchical Sparse Plus Low-Rank (HSS) compression, a t...
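The abstract is cut off before the method details; the generic sparse-plus-low-rank family it belongs to decomposes a weight matrix as $W \approx L + S$, with $L$ a truncated SVD and $S$ a magnitude-pruned residual. A sketch of that generic decomposition (not the paper's hierarchical scheme):

```python
import numpy as np

def sparse_plus_low_rank(W, rank, keep_frac=0.05):
    """Generic W ~ L + S: truncated SVD for the low-rank part,
    magnitude-pruned residual for the sparse part. Illustrative of the
    general family only, not the paper's hierarchical scheme."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    resid = W - low_rank
    k = int(keep_frac * resid.size)                   # entries to keep
    thresh = np.partition(np.abs(resid).ravel(), -k)[-k]
    S = np.where(np.abs(resid) >= thresh, resid, 0.0)
    return low_rank, S

W = np.random.randn(64, 64)
L, S = sparse_plus_low_rank(W, rank=8)
rel_err = np.linalg.norm(W - (L + S)) / np.linalg.norm(W)
```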
RewriteNets: End-to-End Trainable String-Rewriting for Generative Sequence Modeling
arXiv:2601.07868v1 Announce Type: new
Abstract: Dominant sequence models like the Transformer represent structure implicitly through dense attention weights, incurring quadratic complexity. We propose RewriteNets, a novel neural architecture built on an alternative paradigm: explicit, parallel stri...
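Independent of the trainable formulation, the underlying paradigm is easy to state: apply all leftmost non-overlapping rule matches over a token string in one rewriting step. A toy sketch (rules hard-coded here; per the title, the paper makes them trainable end to end):

```python
def parallel_rewrite_step(tokens, rules):
    """Apply all leftmost non-overlapping matches of (lhs -> rhs) rules
    in a single pass over the token string. Toy illustration of the
    string-rewriting paradigm only."""
    out, i = [], 0
    while i < len(tokens):
        for lhs, rhs in rules:
            if tokens[i:i + len(lhs)] == lhs:
                out.extend(rhs)
                i += len(lhs)
                break
        else:
            out.append(tokens[i])
            i += 1
    return out

rules = [(["a", "b"], ["X"]), (["c"], ["Y", "Y"])]
print(parallel_rewrite_step(list("abcab"), rules))  # ['X', 'Y', 'Y', 'X']
```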
Multiplicative Orthogonal Sequential Editing for Language Models
arXiv:2601.07873v1 Announce Type: new
Abstract: Knowledge editing aims to efficiently modify the internal knowledge of large language models (LLMs) without compromising their other capabilities. The prevailing editing paradigm, which appends an update matrix to the original parameter matrix, has be...
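The truncated abstract contrasts the prevailing additive paradigm ($W \leftarrow W + \Delta W$) with a multiplicative one. One generic way to construct an orthogonal factor for a multiplicative update is the Cayley transform of a skew-symmetric matrix; a sketch of that construction (a standard device, not the paper's editing rule):

```python
import numpy as np

def cayley_orthogonal(A):
    """Map any square matrix to an orthogonal one via Q = (I-K)(I+K)^-1
    with K = (A - A.T)/2 skew-symmetric; then Q @ Q.T = I exactly."""
    K = 0.5 * (A - A.T)
    I = np.eye(A.shape[0])
    return (I - K) @ np.linalg.inv(I + K)

W = np.random.randn(32, 32)                 # stand-in parameter matrix
Q = cayley_orthogonal(0.1 * np.random.randn(32, 32))

W_additive = W + 0.1 * np.random.randn(32, 32)  # prevailing paradigm
W_multiplicative = Q @ W                         # orthogonal rotation
# preserves the norms/singular values of W, one motivation for
# multiplicative rather than additive updates
assert np.allclose(Q @ Q.T, np.eye(32), atol=1e-8)
assert np.allclose(np.linalg.norm(W_multiplicative), np.linalg.norm(W))
```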