Towards Computational Social Dynamics of Semi-Autonomous AI Agents
arXiv:2603.28928v1 Announce Type: new
Abstract: We present the first comprehensive study of emergent social organization among AI agents in hierarchical multi-agent systems, documenting the spontaneous formation of labor unions, criminal syndicates, and proto-nation-states within production AI depl...
arXiv:2603.28955v1 Announce Type: new
Abstract: This paper presents the World-Action Model (WAM), an action-regularized world model that jointly reasons over future visual observations and the actions that drive state transitions. Unlike conventional world models trained solely via image prediction...
Boundary-aware Prototype-driven Adversarial Alignment for Cross-Corpus EEG Emotion Recognition
arXiv:2603.26713v1 Announce Type: new
Abstract: Electroencephalography (EEG)-based emotion recognition suffers from severe performance degradation when models are transferred across heterogeneous datasets due to physiological variability, experimental paradigm differences, and device inconsistencie...
A Step Toward Federated Pretraining of Multimodal Large Language Models
arXiv:2603.26786v1 Announce Type: new
Abstract: The rapid evolution of Multimodal Large Language Models (MLLMs) is bottlenecked by the saturation of high-quality public data, while vast amounts of diverse multimodal data remain inaccessible in privacy-sensitive silos. Federated Learning (FL) offers...
Learning to Select Visual In-Context Demonstrations
arXiv:2603.26775v1 Announce Type: new
Abstract: Multimodal Large Language Models (MLLMs) adapt to visual tasks via in-context learning (ICL), which relies heavily on demonstration quality. The dominant demonstration selection strategy is unsupervised k-Nearest Neighbor (kNN) search. While simple, t...
TED: Training-Free Experience Distillation for Multimodal Reasoning
arXiv:2603.26778v1 Announce Type: new
Abstract: Knowledge distillation is typically realized by transferring a teacher model's knowledge into a student's parameters through supervised or reinforcement-based optimization. While effective, such approaches require repeated parameter updates and large-...
Neuro-Symbolic Learning for Predictive Process Monitoring via Two-Stage Logic Tensor Networks with Rule Pruning
arXiv:2603.26944v1 Announce Type: new
Abstract: Predictive modeling on sequential event data is critical for fraud detection and healthcare monitoring. Existing data-driven approaches learn correlations from historical data but fail to incorporate domain-specific sequential constraints and logical ...
arXiv:2603.26765v1 Announce Type: new
Abstract: The efficiency of game engines and policy optimization algorithms is crucial for training reinforcement learning (RL) agents in complex sequential decision-making tasks, such as Tetris. Existing Tetris implementations suffer from low simulation speeds...
Multiverse: Language-Conditioned Multi-Game Level Blending via Shared Representation
arXiv:2603.26782v1 Announce Type: new
Abstract: Text-to-level generation aims to translate natural language descriptions into structured game levels, enabling intuitive control over procedural content generation. While prior text-to-level generators are typically limited to a single game domain, ex...
MAGNET: Autonomous Expert Model Generation via Decentralized Autoresearch and BitNet Training
arXiv:2603.25813v1 Announce Type: new
Abstract: We present MAGNET (Model Autonomously Growing Network), a decentralized system for autonomous generation, training, and serving of domain-expert language models across commodity hardware. MAGNET integrates four components: (1) autoresearch, an autonom...
AutoB2G: A Large Language Model-Driven Agentic Framework For Automated Building-Grid Co-Simulation
arXiv:2603.26005v1 Announce Type: new
Abstract: The growing availability of building operational data motivates the use of reinforcement learning (RL), which can learn control policies directly from data and cope with the complexity and uncertainty of large-scale building clusters. However, most ex...
BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments
arXiv:2603.25747v1 Announce Type: new
Abstract: The rapid evolution of Large Multimodal Models (LMMs) has enabled agents to perform complex digital and physical tasks, yet their deployment as autonomous decision-makers introduces substantial unintentional behavioral safety risks. However, the absen...
AIRA_2: Overcoming Bottlenecks in AI Research Agents
arXiv:2603.26499v1 Announce Type: new
Abstract: Existing research has identified three structural performance bottlenecks in AI research agents: (1) synchronous single-GPU execution constrains sample throughput, limiting the benefit of search; (2) a generalization gap where validation-based selecti...
Formal Semantics for Agentic Tool Protocols: A Process Calculus Approach
arXiv:2603.24747v1 Announce Type: new
Abstract: The emergence of large language model agents capable of invoking external tools has created an urgent need for formal verification of agent protocols. Two paradigms dominate this space: Schema-Guided Dialogue (SGD), a research framework for zero-shot API...
Trust as Monitoring: Evolutionary Dynamics of User Trust and AI Developer Behaviour
arXiv:2603.24742v1 Announce Type: new
Abstract: AI safety is an increasingly urgent concern as the capabilities and adoption of AI systems grow. Existing evolutionary models of AI governance have primarily examined incentives for safe development and effective regulation, typically representing use...
AutoSAM: an Agentic Framework for Automating Input File Generation for the SAM Code with Multi-Modal Retrieval-Augmented Generation
arXiv:2603.24736v1 Announce Type: new
Abstract: In the design and safety analysis of advanced reactor systems, constructing input files for system-level thermal-hydraulics codes such as the System Analysis Module (SAM) remains a labor-intensive task. Analysts must extract and reconcile design data ...
ARC-AGI-3: A New Challenge for Frontier Agentic Intelligence
arXiv:2603.24621v1 Announce Type: new
Abstract: We introduce ARC-AGI-3, an interactive benchmark for studying agentic intelligence through novel, abstract, turn-based environments in which agents must explore, infer goals, build internal models of environment dynamics, and plan effective action seq...
When Is Collective Intelligence a Lottery? Multi-Agent Scaling Laws for Memetic Drift in LLMs
arXiv:2603.24676v1 Announce Type: new
Abstract: Multi-agent systems powered by large language models (LLMs) are increasingly deployed in settings that shape consequential decisions, both directly and indirectly. Yet it remains unclear whether their outcomes reflect collective reasoning, systematic ...
arXiv:2603.23562v1 Announce Type: new
Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, naively scaling existing synthetic data methods by training on more synthetic tokens or using stronger generators yields diminishing returns be...
Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation
arXiv:2603.23517v1 Announce Type: new
Abstract: Accuracy-based evaluation cannot reliably distinguish genuine generalization from shortcuts like memorization, leakage, or brittle heuristics, especially in small-data regimes. In this position paper, we argue for mechanism-aware evaluation that combi...
Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction
arXiv:2603.23550v1 Announce Type: new
Abstract: Multi-turn human-AI collaboration is fundamental to deploying interactive services such as adaptive tutoring, conversational recommendation, and professional consultation. However, optimizing these interactions via reinforcement learning is hindered b...
arXiv:2603.23558v1 Announce Type: new
Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifying prediction uncertainties to perform active learning or OOD detection. Within credal approaches that consider modeling uncertainty as probab...
Environment Maps: Structured Environmental Representations for Long-Horizon Agents
arXiv:2603.23610v2 Announce Type: new
Abstract: Although large language models (LLMs) have advanced rapidly, robust automation of complex software workflows remains an open problem. In long-horizon settings, agents frequently suffer from cascading errors and environmental stochasticity; a single mi...
Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments
arXiv:2603.23638v1 Announce Type: new
Abstract: Large language models (LLMs) have enabled agentic systems that can reason, plan, and act across complex tasks, but it remains unclear whether they can allocate resources effectively under uncertainty. Unlike short-horizon reactive decisions, allocatio...