Sequential KV Cache Compression via Probabilistic Language Tries: Beyond the Per-Vector Shannon Limit
arXiv:2604.15356v1 Announce Type: new
Abstract: Recent work on KV cache quantization, culminating in TurboQuant, has approached the Shannon entropy limit for per-vector compression of transformer key-value caches. We observe that this limit applies to a strictly weaker problem than the one that act...
Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures
arXiv:2604.15351v1 Announce Type: new
Abstract: Low-Rank Adaptation (LoRA) has become the dominant parameter-efficient fine-tuning method for large language models, yet standard practice applies LoRA adapters uniformly to all transformer layers regardless of their relevance to the downstream task. ...
Artificial neurons successfully communicate with living brain cells
Engineers at Northwestern University have taken a striking leap toward merging machines with the human brain by printing artificial neurons that can actually communicate with real ones. These flexible, low-cost devices generate lifelike electrical signals capable of activating living brain cells, a ...
Shapley Value-Guided Adaptive Ensemble Learning for Explainable Financial Fraud Detection with U.S. Regulatory Compliance Validation
arXiv:2604.14231v1 Announce Type: new
Abstract: Financial crime costs U.S. institutions over $32 billion each year. Although AI tools for fraud detection have become more advanced, their use in real-world systems still faces a major obstacle: many of these models operate as black boxes that cannot ...
Towards Verified and Targeted Explanations through Formal Methods
arXiv:2604.14209v1 Announce Type: new
Abstract: As deep neural networks are deployed in safety-critical domains such as autonomous driving and medical diagnosis, stakeholders need explanations that are interpretable but also trustworthy with formal guarantees. Existing XAI methods fall short: heuri...
The Devil Is in Gradient Entanglement: Energy-Aware Gradient Coordinator for Robust Generalized Category Discovery
arXiv:2604.14176v1 Announce Type: new
Abstract: Generalized Category Discovery (GCD) leverages labeled data to categorize unlabeled samples from known or unknown classes. Most previous methods jointly optimize supervised and unsupervised objectives and achieve promising results. However, inherent o...
International Conference on Learning Representations (ICLR) 2026
Apple is presenting new research at the annual International Conference on Learning Representations (ICLR), which takes place in person in Rio de Janeiro, Brazil, from April 23 to 27. We are proud to again sponsor the conference, which brings together the scientific and industrial research communiti...
AI safety shifts from the model to the system level. As AI becomes agentic and tool-driven, risk emerges from complex interactions, widening the gap between evaluation and real-world behavior.
Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking
arXiv:2604.13123v1 Announce Type: new
Abstract: Grokking -- delayed generalisation long after memorisation -- lacks a predictive mechanistic explanation. We identify the normalised spectral entropy $\tilde{H}(t)$ of the representation covariance as a scalar order parameter for this transition, vali...
The Long-Horizon Task Mirage? Diagnosing Where and Why Agentic Systems Break
arXiv:2604.11978v1 Announce Type: new
Abstract: Large language model (LLM) agents perform strongly on short- and mid-horizon tasks, but often break down on long-horizon tasks that require extended, interdependent action sequences. Despite rapid progress in agentic systems, these long-horizon failur...
The Non-Optimality of Scientific Knowledge: Path Dependence, Lock-In, and The Local Minimum Trap
arXiv:2604.11828v1 Announce Type: new
Abstract: Science is widely regarded as humanity's most reliable method for uncovering truths about the natural world. Yet the \emph{trajectory} of scientific discovery is rarely examined as an optimization problem in its own right. This paper argues that the b...
arXiv:2604.11838v1 Announce Type: new
Abstract: While critical for alignment, Supervised Fine-Tuning (SFT) incurs the risk of catastrophic forgetting, yet the layer-wise emergence of instruction-following capabilities remains elusive. We investigate this mechanism via a comprehensive analysis utili...
Linear Programming for Multi-Criteria Assessment with Cardinal and Ordinal Data: A Pessimistic Virtual Gap Analysis
arXiv:2604.09555v1 Announce Type: new
Abstract: Multi-criteria Analysis (MCA) is used to rank alternatives based on various criteria. Key MCA methods, such as Multiple Criteria Decision Making (MCDM) methods, estimate parameters for criteria to compute the performance of each alternative. Nonethele...
Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions
arXiv:2604.08569v1 Announce Type: new
Abstract: Traffic simulation and digital-twin calibration is a challenging optimization problem with a limited simulation budget. Each trial requires an expensive simulation run, and the relationship between calibration inputs and model error is often nonconvex...
Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study
arXiv:2604.08621v1 Announce Type: new
Abstract: In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisation of static, rule-based messaging strategies. While adaptive and autonomous learning systems offer the promise of scalable personalisati...
Inside the AI Index: 12 Takeaways from the 2026 Report
The annual report reveals a field hitting breakthrough capabilities while raising urgent questions about environmental costs, transparency, and who benefits from the technology.
AI is splitting in two directions. One path is controlled, restricted, and security-first. The other is open, autonomous, and scaling fast. The real question isn’t which is better; it’s what this means for trust.
We call it machine learning. But do machines actually learn?
Today's AI systems train, optimize, and scale, but real learning is something else entirely. The distinction matters more than the industry wants to admit.
Flow Learners for PDEs: Toward a Physics-to-Physics Paradigm for Scientific Computing
arXiv:2604.07366v1 Announce Type: new
Abstract: Partial differential equations (PDEs) govern nearly every physical process in science and engineering, yet solving them at scale remains prohibitively expensive. Generative AI has transformed language, vision, and protein science, but learned PDE solv...
Apple is presenting new research at the annual ACM (Association for Computing Machinery) CHI Conference on Human Factors in Computing Systems, which takes place in person in Barcelona, Spain, from April 13 to 17. We are proud to again sponsor the conference, which brings together the scientific and i...
SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems
arXiv:2604.06375v1 Announce Type: new
Abstract: AI-driven symptom analysis systems face persistent challenges in reliability, interpretability, and hallucination. End-to-end generative approaches often lack traceability and may produce unsupported or inconsistent diagnostic outputs in safety-critic...
High-Precision Estimation of the State-Space Complexity of Shogi via the Monte Carlo Method
arXiv:2604.06189v1 Announce Type: new
Abstract: Determining the state-space complexity of the game of Shogi (Japanese Chess) has been a challenging problem, with previous combinatorial estimates leaving a gap of five orders of magnitude ($10^{64}$ to $10^{69}$). This large gap arises from the diffi...
ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback
arXiv:2604.04940v1 Announce Type: new
Abstract: Designing effective heuristics for NP-hard combinatorial optimization problems remains a challenging and expertise-intensive task. Existing applications of large language models (LLMs) primarily rely on one-shot code synthesis, yielding brittle heuris...