One Mask to Rule Them All: On Hidden Facts after Editing and How to Find Them
arXiv:2605.28839v1 Announce Type: new
Abstract: Knowledge editing methods such as ROME and MEMIT update factual associations in transformer models by modifying MLP weights. While evaluated mainly by output behavior, their internal mechanism remains underexplored. We investigate whether edits rely o...
Why LLMs Fail at Causal Discovery and How Interventional Agents Escape
arXiv:2605.27567v1 Announce Type: new
Abstract: Causal discovery is a cornerstone of scientific reasoning, yet whether large language models can perform it reliably remains an open question. Recent benchmarks show that even fine-tuned models plateau on simple causal graphs and degrade as complexity...
On the Origin of Synthetic Information by Means of Steganographic Inheritance
arXiv:2605.27551v1 Announce Type: new
Abstract: The origin of species has been the mystery of mysteries in natural science. By analogy, the origin of synthetic information, we suggest, is the mystery of mysteries in information science. The question carries a moral weight that a technical account c...
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026
Apple is presenting new research at the annual IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), which takes place in person in Denver at the Colorado Convention Center from June 3 to June 7.
We are proud to sponsor the conference, which brings together the scientific and indust...
SilIF: Silhouette-Augmented Isolation Forest for Unsupervised Transaction Fraud Detection
arXiv:2605.26135v1 Announce Type: new
Abstract: Unsupervised anomaly detection is widely used in transaction fraud detection where labels are scarce. Isolation Forest (IF) is among the most popular classical methods due to its scalability and ease of deployment. We propose SilIF, an augmentation of...
AI agents keep breaking in production. Here's why nobody's fixed it yet.
The gap between what agentic AI promises and what it actually delivers in live environments is now one of the most consequential engineering problems in the industry. It is also, frustratingly, one that the field has been slow to name precisely, let alone fix...
CAFD: Concept-Aware DNN Fault Detection using VLMs
arXiv:2605.24008v1 Announce Type: new
Abstract: Fault detection for Deep Neural Networks (DNNs) has received increasing attention in recent years. While more advanced hybrid approaches have been proposed to combine multiple sources of information and outperform earlier techniques, they often incur ...
FusionSense: Tri-Stage Near-Sensor Learning for Runtime-Adaptive Multimodal Edge Intelligence
arXiv:2605.22868v1 Announce Type: new
Abstract: Autonomous systems and smart-industry deployments increasingly split computation across near-sensor, edge, and cloud resources, where tight energy, latency, and reliability budgets demand run-time adaptivity. In practice, deciding what to compute and ...
TO-Agents: A Multi-Agent AI Pipeline for Preference-Guided Topology Optimization
arXiv:2605.21622v1 Announce Type: new
Abstract: Topology optimization can generate efficient structures, but designers often must manually translate qualitative intent, such as desired visual style, product experience, or manufacturability, into solver settings that are not directly tied to those pr...
The 3 reasons your AI never makes it to production
Most companies don't have an AI problem. They have a throughput problem. And I think that distinction matters a lot when you start talking about how to actually get AI working in production.
Double descent for least-squares interpolation on contaminated data: A simulation study
arXiv:2605.21494v1 Announce Type: new
Abstract: Overparametrized models can exhibit excellent generalization performance, although according to classical statistical theory they should be prone to overfitting. The discovery of "double descent", indicating that the generalization error decrea...
Don't Collapse Your Features: Why CenterLoss Hurts OOD Detection and Multi-Scale Mahalanobis Wins
arXiv:2605.21493v1 Announce Type: new
Abstract: The ability to detect out-of-distribution (OOD) inputs is fundamental to safe deployment of machine learning systems. Yet, current methods often rely on feature representations that are optimised solely for classification accuracy, neglecting the dist...
VSAS-Bench: Real-Time Evaluation of Visual Streaming Assistant Models
Streaming vision-language models (VLMs) continuously generate responses given an instruction prompt and an online stream of input frames. This is a core mechanism for real-time visual assistants. Existing VLM frameworks predominantly assess models in offline settings. In contrast, the performance of...
Vega: Zero-knowledge proofs for digital identity in the age of AI
Vega turns a full credential into a single proof, sharing only what is needed and nothing more, with performance that works in real apps.
Source: Microsoft Research.
arXiv:2605.20467v1 Announce Type: new
Abstract: Neural networks can be trained to rank the choices made by logical reasoners, resulting in more efficient searches for answers. A key step in this process is creating useful embeddings, i.e., numeric representations of logical statements. This paper i...
OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind
arXiv:2605.20423v1 Announce Type: new
Abstract: Large Language Models (LLMs) perform well on many language tasks, but their Theory of Mind (ToM) reasoning is still uneven in complex social settings. Existing benchmarks, including ExploreToM, do not always test the recursive beliefs and information ...
MagBridge-Battery: A Synthetic Bridge Dataset for Li-ion Magnetometry and State-of-Health Diagnostics
arXiv:2605.20240v1 Announce Type: new
Abstract: Battery health diagnostics today rely overwhelmingly on electrochemical signals measured at the cell terminals. A parallel literature has shown that magnetic sensing can resolve information that terminal-only measurements miss, but method development ...
AI agents keep breaking in production. Here's why nobody's fixed it yet
78% of enterprises have an AI agent pilot running. Only 14% have successfully scaled one. The gap isn't a model problem. It's an engineering one (and it's hiding in plain sight).
Forget electrons, this breakthrough uses light-matter particles to power AI
Researchers at Penn have created a hybrid light-matter particle that could dramatically speed up AI computing while using far less energy. The breakthrough may help replace some electronic computing processes with ultra-efficient light-based technology.
SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces
arXiv:2605.15215v1 Announce Type: new
Abstract: Recently, skills have been widely adopted in large language model (LLM)-based agent systems across various domains. In existing frameworks, skills are typically injected into the agent reasoning loop as contextual guidance once matched to a runtime ta...