Perplexity AI Introduces Hybrid Local-Server Inference Orchestrator for Personal Computer: Automatic On-Device and Cloud Task Routing
Perplexity AI announces a hybrid local-server inference orchestrator for Personal Computer, automatically routing AI tasks between on-device and cloud models.
The Meta hack shows there’s more to AI security than Mythos
On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama Wh...
Seoul Purpose: How NVIDIA and South Korea Are Building the Future of AI
Home to cutting-edge sovereign AI infrastructure and robotics innovators, as well as one of the world’s most passionate gaming communities, South Korea is one of the world’s centers of AI. NVIDIA founder and CEO Jensen Huang is in Seoul this week to meet the partners and builders behind that work. S...
Scientists are seriously asking if bees and ChatGPT are conscious
New studies suggest consciousness can't be judged solely by behavior, whether it's a chatbot discussing philosophy or a bee searching for nectar. Researchers are increasingly focusing on the internal mechanisms of brains and computers, concluding that today's AI is likely not conscious while leaving...
The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models
arXiv:2606.05169v1 Announce Type: new
Abstract: We give a stereological theory of LLM benchmark coverage. For any suite with effective dimensionality d_eff, the visible Hausdorff distance between two convex capability profiles consistent with the same scores is bounded by epsilon + C R m^(-1/(d_eff...
ERRORQUAKE: Heavy-Tailed Error Severity Distributions in Open-Weight Large Language Models
arXiv:2606.05170v1 Announce Type: new
Abstract: At matched accuracy, open-weight LLMs differ substantially in the shape of their error severity distribution -- a difference invisible to the scalar error rate. Hallucination benchmarks report a single error count and treat all errors as equivalent, y...
Staged Factorial Screening for Budget-Constrained Micro-Pretraining
arXiv:2606.05186v1 Announce Type: new
Abstract: Budget-constrained micro-pretraining often requires triaging many candidate recipes on a shared accelerator before larger search budgets are spent. We study whether a staged fractional-factorial workflow can recover stable early effect structure in th...
Temporal Preference Concepts and their Functions in a Large Language Model
arXiv:2606.05194v1 Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly being deployed to make decisions that require trading off near-term gains against long-term consequences, yet little is known about how they internally represent or resolve these tradeoffs. In this work, w...
How Far Did They Go? The Persuasive Tactics of Covert LLM Agents in a Discontinued Field Experiment
arXiv:2606.05256v1 Announce Type: new
Abstract: This study analyzes a publicly released dataset from a discontinued field experiment on Reddit's r/ChangeMyView. The intervention, conducted by unknown, external researchers and halted following ethical backlash, involved undisclosed AI-generated acco...
What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems
arXiv:2606.05304v1 Announce Type: new
Abstract: Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. However, this free-form co...
I Know What You Meme, Even If it Emerged Today: Understanding Evolving Memes through Open-World Knowledge Acquisition
arXiv:2606.05316v1 Announce Type: new
Abstract: Multimodal memes are dynamic and often require up-to-date background knowledge for interpretation. Existing methods often overlook such knowledge or rely on fixed parametric knowledge of pretrained models that may be incomplete, outdated, or unavailab...
GITCO: Gated Inference-Time Context Optimization in TSFMs
arXiv:2606.05332v1 Announce Type: new
Abstract: Patch-based Time Series Foundation Models (TSFMs) suffer from context poisoning: structurally anomalous patches capture disproportionate attention and silently degrade zero-shot forecast quality. We propose improving TSFM accuracy at inference time by...
Uncertainty Aware Functional Behavior Prediction and Material Fatigue Assessment for Circular Factory
arXiv:2606.05334v1 Announce Type: new
Abstract: Returned products in circular factories re-enter production with heterogeneous degradation states, usage histories, and remaining capability. Reuse cannot be decided from the current inspection alone, because future function fulfillment and component ...
Building a Semantic Search Engine and Open-Status Classifier over the ResearchMath-14k Dataset
This tutorial walks through a complete NLP pipeline for research-level mathematics. Using the ResearchMath-14k dataset, we extract field-specific keywords with TF-IDF, generate sentence embeddings, visualize the problem landscape with UMAP, cluster with K-Means, build a semantic search engine, and t...
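As a rough illustration of the search step described in this summary, here is a minimal sketch of ranking problem statements against a query. It is not the tutorial's code: it swaps the sentence embeddings for plain TF-IDF vectors with cosine similarity so that it runs on the standard library alone, which makes it lexical rather than truly semantic. The three problem statements, the query, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, keep purely alphabetic tokens.
    return [t for t in text.lower().replace("-", " ").split() if t.isalpha()]


def to_vector(tokens, idf):
    # Build an L2-normalized TF-IDF vector as a sparse dict; unknown terms are dropped.
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    # Smoothed inverse document frequency, then one vector per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [to_vector(toks, idf) for toks in tokenized]


def search(query, docs, idf, vectors, k=2):
    # Cosine similarity reduces to a dot product because vectors are normalized.
    q = to_vector(tokenize(query), idf)
    scores = [sum(w * v.get(t, 0.0) for t, w in q.items()) for v in vectors]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(docs[i], scores[i]) for i in ranked[:k]]


# Hypothetical toy corpus standing in for the dataset's problem statements.
docs = [
    "Prove that every planar graph is four colorable",
    "Bound the eigenvalues of a random symmetric matrix",
    "Classify finite simple groups of small order",
]
idf, vectors = build_index(docs)
results = search("eigenvalues of random matrix", docs, idf, vectors)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

Replacing `to_vector` with a call to a sentence-embedding model would turn this into the embedding-based search the summary describes, with the ranking logic unchanged.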
NVIDIA AI Releases Nemotron 3 Ultra: An Open 550B Mixture-of-Experts Hybrid Mamba-Transformer for Long-Running Agents
NVIDIA has released Nemotron 3 Ultra, a 550B total (55B active) open Mixture-of-Experts hybrid Mamba-Transformer for long-running agents. It pairs a 1M-token context with up to ~6x higher inference throughput than comparable open LLMs at on-par accuracy, and ships with open weights, training data, a...
Defense tech, AI, and fundraising take center stage at StrictlyVC Los Angeles on June 18
With just two weeks to go, StrictlyVC Los Angeles is quickly approaching. On Thursday, June 18, at The Aerospace Corporation Campus in El Segundo, investors, founders, and tech leaders will gather for an evening of conversations exploring some of the most consequential shifts taking place across ven...
Apple approves Poke as the first AI agent on its Messages for Business platform
Poke, the startup that lets people use AI agents through simple text messages, has become the first AI agent approved for Apple’s Messages for Business platform.
Meta rolls out a new AI creator assistant on Facebook
Creators often have to parse through charts and dashboards to understand their performance, but with the new AI assistant, they can get quick answers to questions like "When should I post?" and "What are people saying in my comments?"
Five Ways to Fine-Tune Chronos-2, the Time Series Foundation Model
In Part 1 of this series, we introduced Chronos-2, a time-series foundation model. We got our hands dirty by walking through a real case study and saw what Chronos-2 can do straight out of the box, with no training. But as we noted at the end of Part 1, zero-shot isn’t always enough. In cases […]