AI to Learn 2.0: A Deliverable-Oriented Governance Framework and Maturity Rubric for Opaque AI in Learning-Intensive Domains
arXiv:2604.19751v1 Announce Type: new
Abstract: Generative AI is entering research, education, and professional work faster than current governance frameworks can specify how AI-assisted outputs should be judged in learning-intensive settings. The central problem is proxy failure: a polished artifa...
Learning Long-Term Motion Embeddings for Efficient Kinematics Generation
Understanding and predicting motion is a fundamental component of visual intelligence. Although modern video models exhibit strong comprehension of scene dynamics, exploring multiple possible futures through full video synthesis remains prohibitively inefficient. We model scene dynamics orders of ma...
Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts
arXiv:2604.19835v1 Announce Type: new
Abstract: Mixture-of-Experts (MoE) has become the dominant architecture for scaling large language models: frontier models routinely decouple total parameters from per-token computation through sparse expert routing. Scaling laws show that under fixed active co...
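The upcycling idea the title alludes to can be sketched generically: initialize every expert of a sparse MoE layer as a copy of a pre-trained dense FFN and add a fresh router. The NumPy toy below is a minimal sketch under that assumption (random weights stand in for pre-trained ones; per-token loops are for clarity, not efficiency) and is not this paper's specific recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dense_ffn(d, hidden):
    # Stand-in for a pre-trained dense FFN (random weights are an assumption).
    return {"W1": rng.normal(size=(d, hidden)), "W2": rng.normal(size=(hidden, d))}

def ffn(params, x):
    return np.maximum(x @ params["W1"], 0.0) @ params["W2"]  # ReLU FFN

def upcycle(dense, num_experts, d):
    # Every expert starts as an exact copy of the dense FFN; the router is fresh.
    experts = [{k: v.copy() for k, v in dense.items()} for _ in range(num_experts)]
    router = rng.normal(size=(d, num_experts))
    return experts, router

def moe_forward(experts, router, x, top_k=1):
    out = np.zeros_like(x)
    for t, tok in enumerate(x):              # per-token sparse routing
        logits = tok @ router
        p = np.exp(logits - logits.max())
        p /= p.sum()
        for e in np.argsort(p)[-top_k:]:     # top-k experts by router probability
            out[t] += p[e] * ffn(experts[e], tok)
    return out

d = 8
dense = make_dense_ffn(d, hidden=32)
experts, router = upcycle(dense, num_experts=4, d=d)
y = moe_forward(experts, router, rng.normal(size=(5, d)))
print(y.shape)
```

Because each expert is identical at initialization, total parameters grow by the expert count while per-token compute stays at roughly one FFN's worth — the decoupling the abstract describes.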
Transparent Screening for LLM Inference and Training Impacts
arXiv:2604.19757v1 Announce Type: new
Abstract: This paper presents a transparent screening framework for estimating inference and training impacts of current large language models under limited observability. The framework converts natural-language application descriptions into bounded environment...
WorkflowGen: an adaptive workflow generation mechanism driven by trajectory experience
arXiv:2604.19756v1 Announce Type: new
Abstract: Large language model (LLM) agents often suffer from high reasoning overhead, excessive token consumption, unstable execution, and inability to reuse past experiences in complex tasks like business queries, tool use, and workflow orchestration. Traditi...
ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel
Recurrent Neural Networks (RNNs) are naturally suited to efficient inference, requiring far less memory and compute than attention-based architectures, but the sequential nature of their computation has historically made it impractical to scale up RNNs to billions of parameters. A new advancement fr...
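The standard building block for parallelizing recurrences is an associative scan: for a *linear* recurrence h_t = a_t·h_{t-1} + b_t, the pairs (a_t, b_t) compose associatively, so the prefix computation can run as a parallel scan. The toy below checks that composition against the direct sequential recurrence; it illustrates only the well-known linear case, not the nonlinear-RNN parallelization this work addresses.

```python
import numpy as np

def combine(x, y):
    """Associative operator for h_t = a_t*h_{t-1} + b_t:
    applying (a1, b1) then (a2, b2) equals (a2*a1, a2*b1 + b2)."""
    a1, b1 = x
    a2, b2 = y
    return (a2 * a1, a2 * b1 + b2)

def scan(a, b):
    """Inclusive prefix scan with `combine` (done serially here; since the
    operator is associative, it could run as a parallel prefix computation)."""
    out, acc = [], (1.0, 0.0)          # (1, 0) is the identity element
    for pair in zip(a, b):
        acc = combine(acc, pair)
        out.append(acc[1])             # acc = (product of a's so far, h_t)
    return np.array(out)

a = np.array([0.5, 0.9, 0.8, 1.1])
b = np.array([1.0, 0.0, 2.0, -1.0])

# Reference: direct sequential recurrence starting from h_0 = 0.
h, ref = 0.0, []
for ai, bi in zip(a, b):
    h = ai * h + bi
    ref.append(h)

print(np.allclose(scan(a, b), ref))
```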
AutoAdapt: Automated domain adaptation for large language models
Deploying large language models (LLMs) in real-world, high-stakes settings is harder than it should be. In domains like law, medicine, and cloud incident response, performance and reliability can quickly break down because adapting models to domain-specific requirements is a slow and ma...
Quantum inspired qubit qutrit neural networks for real time financial forecasting
arXiv:2604.18838v1 Announce Type: new
Abstract: This research investigates the performance and efficacy of machine learning models in stock prediction, comparing Artificial Neural Networks (ANNs), Quantum Qubit-based Neural Networks (QQBNs), and Quantum Qutrit-based Neural Networks (QQTNs). By outl...
AI scientists produce results without reasoning scientifically
arXiv:2604.18805v1 Announce Type: new
Abstract: Large language model (LLM)-based systems are increasingly deployed to conduct scientific research autonomously, yet whether their reasoning adheres to the epistemic norms that make scientific inquiry self-correcting is poorly understood. Here, we eval...
ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System
arXiv:2604.18789v1 Announce Type: new
Abstract: Reinforcement Learning from Human Feedback (RLHF) is central to aligning Large Language Models (LLMs), yet it introduces a critical vulnerability: an imperfect Reward Model (RM) can become a single point of failure when it fails to penalize unsafe beh...
Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations
arXiv:2604.18724v1 Announce Type: new
Abstract: Users typically interact with and evaluate language models via single outputs, but each output is just one sample from a broad distribution of possible completions. This interaction hides distributional structure such as modes, uncommon edge cases, an...
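The abstract's premise is easy to make concrete: drawing many samples instead of one exposes the distribution's modes and rare completions. A minimal sketch with a stand-in sampler — the prompt, completions, and weights below are hypothetical placeholders for a real model:

```python
import random
from collections import Counter

random.seed(0)

def sample_completion(prompt):
    """Stand-in for an LLM sampler: draws one completion from a fixed
    distribution (hypothetical data, just to make the point concrete)."""
    return random.choices(
        ["Paris", "Paris.", "The capital is Paris", "Lyon"],
        weights=[0.55, 0.25, 0.15, 0.05],
    )[0]

# A single output hides this structure; many samples reveal modes and edge cases.
samples = [sample_completion("Capital of France?") for _ in range(1000)]
dist = Counter(samples)
for completion, n in dist.most_common():
    print(f"{n / len(samples):5.1%}  {completion!r}")
```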
Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training
arXiv:2604.18701v1 Announce Type: new
Abstract: Local prediction-error-based curiosity rewards focus on the current transition without considering the world model's cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the i...
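A hedged toy of the general idea, not the paper's estimator: score each world-model update by how much it reduces the summed prediction error over *all* visited transitions, rather than the error on the current transition alone. The linear dynamics, learning rate, and SGD update below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear world model W learned online; the true dynamics are s' = A s.
true_A = np.array([[0.9, 0.1], [0.0, 0.8]])
W = np.zeros((2, 2))

def cumulative_error(transitions, W):
    """World model's summed squared prediction error over ALL visited transitions."""
    return sum(float(np.sum((s_next - W @ s) ** 2)) for s, s_next in transitions)

buffer, lr, rewards = [], 0.05, []
for step in range(200):
    s = rng.normal(size=2)
    s_next = true_A @ s
    buffer.append((s, s_next))
    before = cumulative_error(buffer, W)
    W += lr * np.outer(s_next - W @ s, s)   # one SGD step on the new transition
    after = cumulative_error(buffer, W)
    rewards.append(before - after)          # cumulative-error *improvement*

print(round(cumulative_error(buffer, W), 3))
```

A purely local curiosity signal would use only the error on `(s, s_next)`; the cumulative variant credits updates that improve the model everywhere it has been.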
FASE: A Fairness-Aware Spatiotemporal Event Graph Framework for Predictive Policing
arXiv:2604.18644v1 Announce Type: new
Abstract: Predictive policing systems that allocate patrol resources based solely on predicted crime risk can unintentionally amplify racial disparities through feedback driven data bias. We present FASE, a Fairness Aware Spatiotemporal Event Graph framework, w...
Easy Samples Are All You Need: Self-Evolving LLMs via Data-Efficient Reinforcement Learning
arXiv:2604.18639v1 Announce Type: new
Abstract: Previous LLMs-based RL studies typically follow either supervised learning with high annotation costs, or unsupervised paradigms using voting or entropy-based rewards. However, their performance remains far from satisfactory due to the substantial ann...
Compile to Compress: Boosting Formal Theorem Provers by Compiler Outputs
arXiv:2604.18587v1 Announce Type: new
Abstract: Large language models (LLMs) have demonstrated significant potential in formal theorem proving, yet state-of-the-art performance often necessitates prohibitive test-time compute via massive roll-outs or extended context windows. In this work, we addre...
As AI agents grow more autonomous, trust can't rely on logs alone. In this article, I explore how cryptographic techniques — from content-addressed code to tamper-evident audit trails — are laying the groundwork for a new era of verifiable, auditable AI.
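One of the techniques mentioned, a tamper-evident audit trail, can be sketched as a simple hash chain: each log record commits to both its payload and the previous record's hash, so editing any entry breaks every later link. A minimal illustration, not any particular production scheme (the event strings are hypothetical):

```python
import hashlib
import json

def chain_append(log, event, prev_hash):
    """Append a record whose hash covers the payload and the previous
    record's hash, making any later edit detectable."""
    record = {"event": event, "prev": prev_hash}
    h = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    log.append({**record, "hash": h})
    return h

def verify(log):
    """Recompute every link; a tampered record breaks the chain."""
    prev = "genesis"
    for rec in log:
        expected = hashlib.sha256(
            json.dumps({"event": rec["event"], "prev": rec["prev"]},
                       sort_keys=True).encode()
        ).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

log, prev = [], "genesis"
for event in ["agent started", "tool call: web_search", "tool call: file_write"]:
    prev = chain_append(log, event, prev)

print(verify(log))                       # chain intact
log[1]["event"] = "tool call: rm -rf"    # tamper with the middle record
print(verify(log))                       # tampering detected
```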
Semantic Consensus: Process-Aware Conflict Detection and Resolution for Enterprise Multi-Agent LLM Systems
arXiv:2604.16339v1 Announce Type: new
Abstract: Multi-agent large language model (LLM) systems are rapidly emerging as the dominant architecture for enterprise AI automation, yet production deployments exhibit failure rates between 41% and 86.7%, with nearly 79% of failures originating from specifi...
BASIS: Balanced Activation Sketching with Invariant Scalars for "Ghost Backpropagation"
arXiv:2604.16324v1 Announce Type: new
Abstract: The activation memory required for exact backpropagation scales linearly with network depth, context length, and feature dimensionality, forming an O(L * B * N) spatial bottleneck (where B is the sequence-batch cardinality and N is the feature dimension...
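The O(L * B * N) scaling the abstract describes is easy to put numbers on. A back-of-envelope sketch treating one cached activation tensor per layer as an illustrative lower bound — real transformers cache several tensors per layer, and the 7B-class configuration below is hypothetical:

```python
def activation_memory_gib(layers, batch_tokens, features, bytes_per_value=2):
    """Activation memory for exact backprop scales as O(L * B * N):
    here, one cached activation tensor per layer (a deliberate lower
    bound; real models store several tensors per layer)."""
    return layers * batch_tokens * features * bytes_per_value / 2**30

# Hypothetical config: 32 layers, 8192 tokens per batch, 4096 features, fp16.
mem = activation_memory_gib(layers=32, batch_tokens=8192, features=4096,
                            bytes_per_value=2)
print(f"{mem:.2f} GiB")  # 2.00 GiB
```

Doubling any one of depth, batch tokens, or feature width doubles this footprint, which is why sketching or recomputing activations is attractive.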
UniMamba: A Unified Spatial-Temporal Modeling Framework with State-Space and Attention Integration
arXiv:2604.16325v1 Announce Type: new
Abstract: Multivariate time series forecasting is fundamental to numerous domains such as energy, finance, and environmental monitoring, where complex temporal dependencies and cross-variable interactions pose enduring challenges. Existing Transformer-based met...
A Discordance-Aware Multimodal Framework with Multi-Agent Clinical Reasoning
arXiv:2604.16333v1 Announce Type: new
Abstract: Knee osteoarthritis frequently exhibits discordance between structural damage observed in imaging and patient-reported symptoms such as pain. This mismatch complicates clinical interpretation and patient stratification and remains insufficiently model...
Governing the Agentic Enterprise: A Governance Maturity Model for Managing AI Agent Sprawl in Business Operations
arXiv:2604.16338v1 Announce Type: new
Abstract: The rapid adoption of agentic AI in enterprise business operations--autonomous systems capable of planning, reasoning, and executing multi-step workflows--has created an urgent governance crisis. Organizations face uncontrolled agent sprawl: the proli...
Heterogeneous Self-Play for Realistic Highway Traffic Simulation
arXiv:2604.16406v1 Announce Type: new
Abstract: Realistic highway simulation is critical for scalable safety evaluation of autonomous vehicles, particularly for interactions that are too rare to study from logged data alone. Yet highway traffic generation remains challenging because it requires bro...
Computational Hermeneutics: Evaluating generative AI as a cultural technology
arXiv:2604.16403v1 Announce Type: new
Abstract: Generative AI systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat culture as a variable to be measured rather than fundamental to the system's operation. Drawing on hermeneutic theory from the hu...
Understanding context is key to understanding human language, an ability which Large Language Models (LLMs) have been increasingly seen to demonstrate to an impressive extent. However, though the evaluation of LLMs encompasses various domains within the realm of Natural Language Processing, limited ...