AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Jun 8, 2026

Frontier Red Team: Measuring LLMs’ impact on N-day exploits

#Anthropic#Claude#LLM

Tool• Jun 5, 2026

Governed agents are here. Is your stack ready?

Microsoft Build 2026 didn't just announce products. It announced a philosophy: the era of the unmanaged AI agent is over.

#AI Accelerator Institute#AI#Research

Tool• Jun 5, 2026

Demystifying AI agents: going beyond the buzzwords

"Agent" is the most overused word in AI right now. But strip away the hype and what are you actually working with? Adobe principal scientist Deepak Pai breaks down the real building blocks of agentic systems and when they're worth reaching for.

#AI Accelerator Institute#AI#Research

Tool• Jun 5, 2026

Scientists are seriously asking if bees and ChatGPT are conscious

New studies suggest consciousness can't be judged solely by behavior, whether it's a chatbot discussing philosophy or a bee searching for nectar. Researchers are increasingly focusing on the internal mechanisms of brains and computers, concluding that today's AI is likely not conscious while leaving...

#Science Daily#AI#Research

Tool• Jun 5, 2026

The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models

arXiv:2606.05169v1 Announce Type: new Abstract: We give a stereological theory of LLM benchmark coverage. For any suite with effective dimensionality d_eff, the visible Hausdorff distance between two convex capability profiles consistent with the same scores is bounded by epsilon + C R m^(-1/(d_eff...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

ERRORQUAKE: Heavy-Tailed Error Severity Distributions in Open-Weight Large Language Models

arXiv:2606.05170v1 Announce Type: new Abstract: At matched accuracy, open-weight LLMs differ substantially in the shape of their error severity distribution -- a difference invisible to the scalar error rate. Hallucination benchmarks report a single error count and treat all errors as equivalent, y...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Staged Factorial Screening for Budget-Constrained Micro-Pretraining

arXiv:2606.05186v1 Announce Type: new Abstract: Budget-constrained micro-pretraining often requires triaging many candidate recipes on a shared accelerator before larger search budgets are spent. We study whether a staged fractional-factorial workflow can recover stable early effect structure in th...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Temporal Preference Concepts and their Functions in a Large Language Model

arXiv:2606.05194v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly being deployed to make decisions that require trading off near-term gains against long-term consequences, yet little is known about how they internally represent or resolve these tradeoffs. In this work, w...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

How Far Did They Go? The Persuasive Tactics of Covert LLM Agents in a Discontinued Field Experiment

arXiv:2606.05256v1 Announce Type: new Abstract: This study analyzes a publicly released dataset from a discontinued field experiment on Reddit's r/ChangeMyView. The intervention, conducted by unknown, external researchers and halted following ethical backlash, involved undisclosed AI-generated acco...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems

arXiv:2606.05304v1 Announce Type: new Abstract: Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. However, this free-form co...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

I Know What You Meme, Even If it Emerged Today: Understanding Evolving Memes through Open-World Knowledge Acquisition

arXiv:2606.05316v1 Announce Type: new Abstract: Multimodal memes are dynamic and often require up-to-date background knowledge for interpretation. Existing methods often overlook such knowledge or rely on fixed parametric knowledge of pretrained models that may be incomplete, outdated, or unavailab...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

GITCO: Gated Inference-Time Context Optimization in TSFMs

arXiv:2606.05332v1 Announce Type: new Abstract: Patch-based Time Series Foundation Models (TSFMs) suffer from context poisoning: structurally anomalous patches capture disproportionate attention and silently degrade zero-shot forecast quality. We propose improving TSFM accuracy at inference time by...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Uncertainty Aware Functional Behavior Prediction and Material Fatigue Assessment for Circular Factory

arXiv:2606.05334v1 Announce Type: new Abstract: Returned products in circular factories re-enter production with heterogeneous degradation states, usage histories, and remaining capability. Reuse cannot be decided from the current inspection alone, because future function fulfillment and component ...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Building compliant AI Agents: Preparing Enterprise teams for the EU AI Act

Turn policy changes into an operating model for building, monitoring, and governing production agents.

#AI Accelerator Institute#AI#Research

Tool• Jun 4, 2026

Position: Deployed Reinforcement Learning should be Continual

arXiv:2606.04029v1 Announce Type: new Abstract: Reinforcement Learning (RL) has received increasing attention and adoption in real-world use cases. Most of these systems follow a train-then-fix paradigm, where trained agents do not learn while interacting with the world until performance degrades a...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Toward Pre-Deployment Assurance for Enterprise AI Agents: Ontology-Grounded Simulation and Trust Certification

arXiv:2606.04037v1 Announce Type: new Abstract: Pre-deployment verification of enterprise artificial intelligence (AI) agents remains a critical gap between large language model (LLM) capability benchmarking and production deployment. Post-deployment monitoring, human-in-the-loop controls, and prom...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Stumbling Into AI Emotional Dependence: How Routine AI Interactions Reshape Human Connection

arXiv:2606.04150v1 Announce Type: new Abstract: Public discourse and emerging policy typically assume that AI emotional support is a deliberate act: a lonely user consciously seeking comfort from a dedicated companion chatbot. In this paper, we draw on emerging empirical evidence and argue that thi...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Thinking Through Signs: PEEL as a Semiotic Scaffolding for Epistemically Accountable AI-Enabled Research

arXiv:2606.04152v1 Announce Type: new Abstract: Large language models are reshaping research practice while quietly eroding researchers' epistemic accountability. This commentary introduces PEEL - Protocols for Epistemically Engaged Literacy in AI, a working scaffolding that combines deterministic d...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models

arXiv:2606.04202v1 Announce Type: new Abstract: As LLMs become more widely deployed, they are increasingly expected to work alongside other AI agents rather than operating in isolation. Effective coordination in these settings requires agents to communicate, share information and make decisions und...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Consensus is Strategically Insufficient: Reasoning-Trace Disagreement as a Knowledge-Representation Signal

arXiv:2606.04223v1 Announce Type: new Abstract: Multi-agent systems are commonly designed to reduce disagreement through voting, consensus protocols, debate, or fault-tolerant aggregation. We argue that this objective is insufficient for value-laden tasks, where disagreement may reflect genuine nor...

#ArXiv#Machine Learning#Academic

Tool• Jun 3, 2026

Human-in-the-Loop Contextual Bandits for Short-Term Rental Dynamic Pricing: Structural Equivalence of Historical Warm-Up and Approval-Gated Live Learning

arXiv:2606.02595v1 Announce Type: new Abstract: Dynamic pricing in short-term rental (STR) markets presents a distinctive challenge for online learning algorithms: pricing decisions carry significant financial risk, operators require explainability, and market feedback is sparse (one booking outcom...

#ArXiv#Machine Learning#Academic

Tool• Jun 3, 2026

Visual Graph Scaffolds for Structural Reasoning in Large Language Models

arXiv:2606.02673v1 Announce Type: new Abstract: Graphs have been used to enhance large language models (LLMs) for structured reasoning, mostly as external knowledge sources provided to models at test time. In this paper, we take a different view: the value of graphs for LLMs lies not only in sup...

#ArXiv#Machine Learning#Academic

Tool• Jun 3, 2026

AURA: Action-Gated Memory for Robot Policies at Constant VRAM

arXiv:2606.02775v1 Announce Type: new Abstract: The KV-cache is the right memory for datacenters but the wrong memory for robots. Datacenter inference batches many short requests and resets them, amortizing an attention cache across a crowd. Embodied agents instead run one long, non-resetting episo...

#ArXiv#Machine Learning#Academic

Tool• Jun 3, 2026

Evaluating Transformer and LSTM Frameworks for Prediction in Ungauged Basins

arXiv:2606.02791v1 Announce Type: new Abstract: Watershed networks exhibit convergent topologies in which multiple tributaries merge into downstream channels, integrating diverse upstream hydrological processes. In ungauged basins, the absence of direct observations increases uncertainty and limits ...

#ArXiv#Machine Learning#Academic
