Q&A: MIT SHASS and the future of education in the age of AI
As the School of Humanities, Arts, and Social Sciences marks 75 years, Dean Agustín Rayo reflects on how AI is reshaping higher education and why SHASS disciplines continue to be central to MIT’s mission.
I can’t claim to be a professional software developer—not by a long shot. I occasionally write some Python code to analyze spreadsheets, and I occasionally hack something together on my own, usually related to prime numbers or numerical analysis. But I have to admit that I identify with both of the ...
Bringing people together at AI for the Economy Forum
Image: a woman working with a large tool; on the left are the Google and MIT FutureTech logos and the text "AI for the Economy Forum" and "Innovation and Adaption in the New Era."
NVIDIA and the University of Maryland Researchers Released Audio Flamingo Next (AF-Next): A Super Powerful and Open Large Audio-Language Model
Understanding audio has always been the multimodal frontier that lags behind vision. While image-language models have rapidly scaled toward real-world deployment, building open models that robustly reason over speech, environmental sounds, and music — especially at length — has remained quite hard. ...
arXiv:2604.09560v1 Announce Type: new
Abstract: Transformers, diffusion-maps, and magnetic Laplacians are usually treated as separate tools; we show they are all different regimes of a single Markov geometry built from pre-softmax query-scores. We define a QK "bidivergence" whose exponentiated and ...
Fairboard: a quantitative framework for equity assessment of healthcare models
arXiv:2604.09656v1 Announce Type: new
Abstract: Despite there now being more than 1,000 FDA-authorised AI medical devices, formal equity assessments -- whether model performance is uniform across patient subgroups -- are rare. Here, we evaluate the equity of 18 open-source brain tumour segmentation...
Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model
arXiv:2604.09665v1 Announce Type: new
Abstract: While the wide adoption of refusal training in large language models (LLMs) has showcased improvements in model safety, recent works have highlighted shortcomings due to the shallow nature of these alignment methods. To this end, the work on Deliberat...
Human-like Working Memory Interference in Large Language Models
arXiv:2604.09670v1 Announce Type: new
Abstract: Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite having on the or...
Belief-State RWKV for Reinforcement Learning under Partial Observability
arXiv:2604.09671v1 Announce Type: new
Abstract: We propose a stronger formulation of RL on top of RWKV-style recurrent sequence models, in which the fixed-size recurrent state is explicitly interpreted as a belief state rather than an opaque hidden vector. Instead of conditioning policy and value o...
LABBench2: An Improved Benchmark for AI Systems Performing Biology Research
arXiv:2604.09554v1 Announce Type: new
Abstract: Optimism for accelerating scientific discovery with AI continues to grow. Current applications of AI in scientific research range from training dedicated foundation models on scientific data to agentic autonomous hypothesis generation systems to AI-dr...
arXiv:2604.09563v1 Announce Type: new
Abstract: AI systems produce large volumes of logs as they interact with tools and users. Analysing these logs can help understand model capabilities, propensities, and behaviours, or assess whether an evaluation worked as intended. Researchers have started dev...
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
arXiv:2604.09574v1 Announce Type: new
Abstract: The rise of autonomous GUI agents has triggered adversarial countermeasures from digital platforms, yet existing research prioritizes utility and robustness over the critical dimension of anti-detection. We argue that for agents to survive in human-ce...
AHC: Meta-Learned Adaptive Compression for Continual Object Detection on Memory-Constrained Microcontrollers
arXiv:2604.09576v1 Announce Type: new
Abstract: Deploying continual object detection on microcontrollers (MCUs) with under 100KB memory requires efficient feature compression that can adapt to evolving task distributions. Existing approaches rely on fixed compression strategies (e.g., FiLM conditio...
Quantum-Accelerated AI: The First Real Break From the Scaling Wall
Quantum optimization is emerging as the first real escape from AI’s scaling limits—accelerating training, reducing compute costs, and redefining frontier models.
Google AI Research Proposes Vantage: An LLM-Based Protocol for Measuring Collaboration, Creativity, and Critical Thinking
Standardized tests can tell you whether a student knows calculus or can parse a passage of text. What they cannot reliably tell you is whether that student can resolve a disagreement with a teammate, generate genuinely original ideas under pressure, or critically dismantle a flawed argument. These a...
Your Model Isn’t Done: Understanding and Fixing Model Drift
How production models fail over time, and how to catch and fix it before it breaks trust.
The post Your Model Isn’t Done: Understanding and Fixing Model Drift appeared first on Towards Data Science.