AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Apr 13, 2026

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

This paper was accepted at the Workshop on Navigating and Addressing Data Problems for Foundation Models at ICLR 2026. Large language models (LLMs) can struggle to memorize factual knowledge in their parameters, often leading to hallucinations and poor performance on knowledge-intensive tasks. In th...

#Apple#On-device AI

Tool• Apr 10, 2026

Rapid prototyping with GenAI: From idea to interactive PoC in days

We’ve been promised a world where one prompt builds perfect software. But the real limitation isn’t the tech, it’s everything AI can’t see.

#AI Accelerator Institute#AI#Research

Tool• Apr 10, 2026

This new chip could slash data center energy waste

A new chip design from UC San Diego could make data centers far more energy-efficient by rethinking how power is converted for GPUs. By combining vibrating piezoelectric components with a clever circuit layout, the system overcomes limitations of traditional designs. The prototype achieved impressiv...

#Science Daily#AI#Research

Tool• Apr 10, 2026

Prediction Arena: Benchmarking AI Models on Real-World Prediction Markets

arXiv:2604.07355v1. Abstract: We introduce Prediction Arena, a benchmark for evaluating AI models' predictive accuracy and decision-making by enabling them to trade autonomously on live prediction markets with real capital. Unlike synthetic benchmarks, Prediction Arena tests model...

#ArXiv#Machine Learning#Academic

Tool• Apr 10, 2026

BLEG: LLM Functions as Powerful fMRI Graph-Enhancer for Brain Network Analysis

arXiv:2604.07361v1. Abstract: Graph Neural Networks (GNNs) have been widely used in diverse brain network analysis tasks based on preprocessed functional magnetic resonance imaging (fMRI) data. However, their performances are constrained due to high feature sparsity and inherent l...

#ArXiv#Machine Learning#Academic

Tool• Apr 10, 2026

LLM-Generated Fault Scenarios for Evaluating Perception-Driven Lane Following in Autonomous Edge Systems

arXiv:2604.07362v1. Abstract: Deploying autonomous vision systems on edge devices faces a critical challenge: resource constraints prevent real-time and predictable execution of comprehensive safety tests. Existing validation methods depend on static datasets or manual fault injec...

#ArXiv#Machine Learning#Academic

Tool• Apr 10, 2026

Benchmark Shadows: Data Alignment, Parameter Footprints, and Generalization in Large Language Models

arXiv:2604.07363v1. Abstract: Large language models often achieve strong benchmark gains without corresponding improvements in broader capability. We hypothesize that this discrepancy arises from differences in training regimes induced by data distribution. To investigate this, we...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

New Future of Work: AI is driving rapid change, uneven benefits

For the past five years, the New Future of Work report has captured how work is changing. This year, the shift feels especially sharp. Previous editions have focused on technology’s role in increasing productivity by automating tasks, accelerating communication, and expanding access to information, ...

#Microsoft#Research

Tool• Apr 9, 2026

Ideas: Steering AI toward the work future we want

Microsoft Chief Scientist Jaime Teevan and researchers Jenna Butler, Jake Hofman, and Rebecca Janssen unpack the New Future of Work Report 2025 and explore the ideal AI-driven working world. Plus, is AI a tool or a collaborator? And why the answer matters.

#Microsoft#Research

Tool• Apr 9, 2026

A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

arXiv:2604.06227v1. Abstract: Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. Thi...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

Probabilistic Language Tries: A Unified Framework for Compression, Decision Policies, and Execution Reuse

arXiv:2604.06228v1. Abstract: We introduce probabilistic language tries (PLTs), a unified representation that makes explicit the prefix structure implicitly defined by any generative model over sequences. By assigning to each outgoing edge the conditional probability of the corres...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

FLeX: Fourier-based Low-rank EXpansion for multilingual transfer

arXiv:2604.06253v1. Abstract: Cross-lingual code generation is critical in enterprise environments where multiple programming languages coexist. However, fine-tuning large language models (LLMs) individually for each language is computationally prohibitive. This paper investigates...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

Spectral Edge Dynamics Reveal Functional Modes of Learning

arXiv:2604.06256v1. Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head at...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

$S^3$: Stratified Scaling Search for Test-Time in Diffusion Language Models

arXiv:2604.06260v1. Abstract: Test-time scaling investigates whether a fixed diffusion language model (DLM) can generate better outputs when given more inference compute, without additional training. However, naive best-of-$K$ sampling is fundamentally limited because it repeatedl...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

Blind Refusal: Language Models Refuse to Help Users Evade Unjust, Absurd, and Illegitimate Rules

arXiv:2604.06233v1. Abstract: Safety-trained language models routinely refuse requests for help circumventing rules. But not all rules deserve compliance. When users ask for help evading rules imposed by an illegitimate authority, rules that are deeply unjust or absurd in their co...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

Toward Reducing Unproductive Container Moves: Predicting Service Requirements and Dwell Times

arXiv:2604.06251v1. Abstract: This article presents the results of a data science study conducted at a container terminal, aimed at reducing unproductive container moves through the prediction of service requirements and container dwell times. We develop and evaluate machine learn...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

Weakly Supervised Distillation of Hallucination Signals into Transformer Representations

arXiv:2604.06277v1. Abstract: Existing hallucination detection methods for large language models (LLMs) rely on external verification at inference time, requiring gold answers, retrieval systems, or auxiliary judge models. We ask whether this external supervision can instead be di...

#ArXiv#Machine Learning#Academic

Tool• Apr 9, 2026

A Theoretical Framework for Acoustic Neighbor Embeddings

This paper provides a theoretical framework for interpreting acoustic neighbor embeddings, which are representations of the phonetic content of variable-width audio or text in a fixed-dimensional embedding space. A probabilistic interpretation of the distances between embeddings is proposed, based o...

#Apple#On-device AI

Tool• Apr 9, 2026

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

This paper was accepted at the Workshop on Memory for LLM-Based Agentic Systems at ICLR. Language models have consistently grown to compress more world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter size. Especially the capaci...

#Apple#On-device AI

Tool• Apr 9, 2026

Policy: Trustworthy agents in practice

#Anthropic#Claude#LLM

Tool• Apr 8, 2026

Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya

arXiv:2604.04937v1. Abstract: Large language models produce fluent text but struggle with systematic reasoning, often hallucinating confident but unfounded claims. When Apple researchers added irrelevant context to mathematical problems, LLM performance degraded by 65% Apple Machi...

#ArXiv#Machine Learning#Academic

Tool• Apr 8, 2026

Operational Noncommutativity in Sequential Metacognitive Judgments

arXiv:2604.04938v1. Abstract: Metacognition, understood as the monitoring and regulation of one's own cognitive processes, is inherently sequential: an agent evaluates an internal state, updates it, and may then re-evaluate under modified criteria. Order effects in cognition are w...

#ArXiv#Machine Learning#Academic

Tool• Apr 8, 2026

Algebraic Structure Discovery for Real World Combinatorial Optimisation Problems: A General Framework from Abstract Algebra to Quotient Space Learning

arXiv:2604.04941v1. Abstract: Many combinatorial optimisation problems hide algebraic structures that, once exposed, shrink the search space and improve the chance of finding the global optimal solution. We present a general framework that (i) identifies algebraic structure, (ii) ...

#ArXiv#Machine Learning#Academic

Tool• Apr 8, 2026

Governance-Aware Agent Telemetry for Closed-Loop Enforcement in Multi-Agent AI Systems

Enterprise multi-agent AI systems produce thousands of inter-agent interactions per hour, yet existing observability tools capture these dependencies without enforcing anything. OpenTelemetry and Langfuse collect telemetry but treat governance as a downstream analytics concern, not a real-time enfor...

#Apple#On-device AI
