$S^3$: Stratified Scaling Search at Test-Time in Diffusion Language Models
arXiv:2604.06260v1 Announce Type: new
Abstract: Test-time scaling investigates whether a fixed diffusion language model (DLM) can generate better outputs when given more inference compute, without additional training. However, naive best-of-$K$ sampling is fundamentally limited because it repeatedl...
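The abstract names naive best-of-$K$ as the baseline being improved on. A minimal sketch of that baseline follows; `sample_from_dlm` and `score` are hypothetical stand-ins, not the paper's components.

```python
# Minimal sketch of the naive best-of-K baseline the abstract critiques.
# `sample_from_dlm` and `score` are placeholders, not the paper's method.
import random

def sample_from_dlm(prompt: str, seed: int) -> str:
    # Placeholder: a real DLM would run iterative denoising here.
    random.seed(seed)
    return f"{prompt} -> candidate#{random.randint(0, 9999)}"

def score(candidate: str) -> float:
    # Placeholder reward model / verifier score.
    return random.random()

def best_of_k(prompt: str, k: int) -> str:
    # Draw K independent samples and keep the highest-scoring one.
    # Every sample repeats the full denoising cost, and all samples come
    # from the same distribution -- the limitation S^3 targets.
    candidates = [sample_from_dlm(prompt, seed=i) for i in range(k)]
    return max(candidates, key=score)

print(best_of_k("Explain test-time scaling.", k=8))
```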
Spectral Edge Dynamics Reveal Functional Modes of Learning
arXiv:2604.06256v1 Announce Type: new
Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head at...
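A hedged illustration of the underlying measurement: how much of a weight update's energy falls in its top singular directions. The metric and toy data are assumptions, not the paper's method.

```python
# Hedged sketch: measuring how concentrated weight updates are along a few
# dominant directions. The exact "spectral edge" statistic is the paper's;
# this energy fraction is an illustrative stand-in.
import numpy as np

rng = np.random.default_rng(0)

# Toy update: a rank-2 signal plus noise, mimicking concentrated dynamics.
d = 64
signal = rng.normal(size=(d, 2)) @ rng.normal(size=(2, d))
delta_w = signal + 0.05 * rng.normal(size=(d, d))

# Fraction of update "energy" captured by the top-k singular directions.
s = np.linalg.svd(delta_w, compute_uv=False)
energy = s**2 / np.sum(s**2)
top_k = 2
print(f"energy in top-{top_k} directions: {energy[:top_k].sum():.2%}")
```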
FLeX: Fourier-based Low-rank EXpansion for multilingual transfer
arXiv:2604.06253v1 Announce Type: new
Abstract: Cross-lingual code generation is critical in enterprise environments where multiple programming languages coexist. However, fine-tuning large language models (LLMs) individually for each language is computationally prohibitive. This paper investigates...
Probabilistic Language Tries: A Unified Framework for Compression, Decision Policies, and Execution Reuse
arXiv:2604.06228v1 Announce Type: new
Abstract: We introduce probabilistic language tries (PLTs), a unified representation that makes explicit the prefix structure implicitly defined by any generative model over sequences. By assigning to each outgoing edge the conditional probability of the corres...
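The definition quoted above is concrete enough to sketch: a trie whose edges carry the model's conditional next-token probabilities, so the product of edge weights along a path recovers the sequence probability by the chain rule. The toy numbers below are illustrative.

```python
# Minimal sketch of a probabilistic language trie as the abstract defines it:
# each edge stores the conditional probability of the next token given the
# prefix. The toy probabilities are illustrative, not from any model.
class PLTNode:
    def __init__(self):
        self.children = {}  # token -> (edge_probability, PLTNode)

    def insert(self, tokens, probs):
        node = self
        for tok, p in zip(tokens, probs):
            if tok not in node.children:
                node.children[tok] = (p, PLTNode())
            node = node.children[tok][1]

    def sequence_prob(self, tokens):
        # Product of edge probabilities along the path (chain rule).
        node, prob = self, 1.0
        for tok in tokens:
            if tok not in node.children:
                return 0.0
            p, node = node.children[tok]
            prob *= p
        return prob

root = PLTNode()
root.insert(["the", "cat", "sat"], [0.20, 0.05, 0.30])
root.insert(["the", "dog"], [0.20, 0.04])
print(root.sequence_prob(["the", "cat", "sat"]))  # 0.2 * 0.05 * 0.3 = 0.003
```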
A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on a Novel Bangladeshi Market Price Dataset
arXiv:2604.06227v1 Announce Type: new
Abstract: Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. Thi...
Weakly Supervised Distillation of Hallucination Signals into Transformer Representations
arXiv:2604.06277v1 Announce Type: new
Abstract: Existing hallucination detection methods for large language models (LLMs) rely on external verification at inference time, requiring gold answers, retrieval systems, or auxiliary judge models. We ask whether this external supervision can instead be di...
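A hedged sketch of the general recipe the question suggests: use an external verifier's noisy labels to train a lightweight probe on internal representations, then drop the verifier at inference. All data below is synthetic and the setup is an assumption, not the paper's.

```python
# Hedged sketch: distill an external verifier's noisy hallucination labels
# into a linear probe on hidden states, so inference needs no verifier.
# Features and labels are synthetic; names are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 32

# Synthetic "hidden states" with a weak linear hallucination signal.
h = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
p_halluc = 1 / (1 + np.exp(-(h @ true_w)))
# Noisy verifier output plays the role of weak supervision.
weak_labels = (p_halluc + 0.3 * rng.normal(size=n) > 0.5).astype(float)

# Train a logistic probe on the weak labels with plain gradient descent.
w = np.zeros(d)
for _ in range(500):
    pred = 1 / (1 + np.exp(-(h @ w)))
    w -= 0.1 * h.T @ (pred - weak_labels) / n

acc = np.mean(((1 / (1 + np.exp(-(h @ w)))) > 0.5) == (p_halluc > 0.5))
print(f"probe agreement with the clean signal: {acc:.2%}")
```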
LaCy: What Small Language Models Can and Should Learn Is Not Just a Question of Loss
This paper was accepted at the Workshop on Memory for LLM-Based Agentic Systems at ICLR.
Language models have consistently grown to compress more world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter size. Especially the capaci...
A Theoretical Framework for Acoustic Neighbor Embeddings
This paper provides a theoretical framework for interpreting acoustic neighbor embeddings, which are representations of the phonetic content of variable-width audio or text in a fixed-dimensional embedding space. A probabilistic interpretation of the distances between embeddings is proposed, based o...
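The abstract is truncated before the interpretation is stated; one standard way to give embedding distances a probabilistic reading (an assumption here, not necessarily the paper's) is a Gaussian-kernel softmax over candidates:

```latex
% Hedged reading, an assumption rather than the paper's stated result:
% if f(x) embeds audio and g(w) embeds a candidate transcription w,
% squared Euclidean distance yields a posterior via a softmax:
\[
  P(w \mid x) \;=\;
  \frac{\exp\!\big(-\lVert f(x) - g(w) \rVert^2 / 2\sigma^2\big)}
       {\sum_{w'} \exp\!\big(-\lVert f(x) - g(w') \rVert^2 / 2\sigma^2\big)}
\]
```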
Algebraic Structure Discovery for Real World Combinatorial Optimisation Problems: A General Framework from Abstract Algebra to Quotient Space Learning
arXiv:2604.04941v1 Announce Type: new
Abstract: Many combinatorial optimisation problems hide algebraic structures that, once exposed, shrink the search space and improve the chance of finding the globally optimal solution. We present a general framework that (i) identifies algebraic structure, (ii) ...
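A minimal sketch of the payoff of exposing such structure, under one assumed symmetry: treating TSP tours as equivalent up to rotation and reflection collapses the search space by a factor of $2n$.

```python
# Hedged sketch of quotienting a search space by a symmetry group so that
# equivalent solutions collapse to one representative. The TSP symmetry
# (rotation + reflection) is an illustrative assumption.
from itertools import permutations

def canonical_tour(tour):
    """Canonical representative of a tour's equivalence class."""
    n = len(tour)
    rotations = [tuple(tour[i:] + tour[:i]) for i in range(n)]
    reflections = [tuple(reversed(r)) for r in rotations]
    return min(rotations + reflections)

n = 6
all_tours = {tuple(p) for p in permutations(range(n))}
quotient = {canonical_tour(list(p)) for p in all_tours}
print(len(all_tours), "->", len(quotient))  # 720 -> 60: a 2n-fold reduction
```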
Operational Noncommutativity in Sequential Metacognitive Judgments
arXiv:2604.04938v1 Announce Type: new
Abstract: Metacognition, understood as the monitoring and regulation of one's own cognitive processes, is inherently sequential: an agent evaluates an internal state, updates it, and may then re-evaluate under modified criteria. Order effects in cognition are w...
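The standard operational formalization of such order effects (assumed here, since the abstract is truncated) models sequential judgments as operators whose composition order matters:

```latex
% Hedged sketch: judgments A and B as operators acting on a cognitive
% state s; an order effect corresponds to a nonzero commutator.
\[
  P_A P_B \, s \;\neq\; P_B P_A \, s
  \quad\Longleftrightarrow\quad
  [P_A, P_B] \;=\; P_A P_B - P_B P_A \;\neq\; 0 .
\]
```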
Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya
arXiv:2604.04937v1 Announce Type: new
Abstract: Large language models produce fluent text but struggle with systematic reasoning, often hallucinating confident but unfounded claims. When Apple researchers added irrelevant context to mathematical problems, LLM performance degraded by 65% (Apple Machi...
Governance-Aware Agent Telemetry for Closed-Loop Enforcement in Multi-Agent AI Systems
Enterprise multi-agent AI systems produce thousands of inter-agent interactions per hour, yet existing observability tools capture these dependencies without enforcing anything. OpenTelemetry and Langfuse collect telemetry but treat governance as a downstream analytics concern, not a real-time enfor...
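A hedged sketch of the contrast drawn here: telemetry that also gates calls in real time rather than only recording them. The policy shape and agent names are illustrative assumptions, not an existing API.

```python
# Hedged sketch of closed-loop enforcement versus passive telemetry: every
# inter-agent call passes through a gate that can block it in real time.
# Rule shapes and agent names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AgentCall:
    src: str
    dst: str
    action: str

DENY_RULES = [
    lambda c: c.action == "exec_shell" and c.dst == "finance-agent",
    lambda c: c.src == "untrusted-plugin",
]

def enforce(call: AgentCall) -> bool:
    """Return True if the call may proceed; log either way (the telemetry)."""
    allowed = not any(rule(call) for rule in DENY_RULES)
    print(f"telemetry: {call} -> {'ALLOW' if allowed else 'BLOCK'}")
    return allowed

enforce(AgentCall("planner", "finance-agent", "exec_shell"))  # blocked
enforce(AgentCall("planner", "search-agent", "web_query"))    # allowed
```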
This new chip survives 1300°F (700°C) and could change AI forever
A team of engineers has created a breakthrough memory device that keeps working at temperatures hotter than molten lava, shattering one of electronics’ biggest limits. Built from an unusual stack of ultra-durable materials, the tiny component can store data and perform calculations even at 700°C (13...
Toward Full Autonomous Laboratory Instrumentation Control with Large Language Models
arXiv:2604.03286v1 Announce Type: new
Abstract: The control of complex laboratory instrumentation often requires significant programming expertise, creating a barrier for researchers lacking computational skills. This work explores the potential of large language models (LLMs), such as ChatGPT, and...
IC3-Evolve: Proof-/Witness-Gated Offline LLM-Driven Heuristic Evolution for IC3 Hardware Model Checking
arXiv:2604.03232v1 Announce Type: new
Abstract: IC3, also known as property-directed reachability (PDR), is a commonly used algorithm for hardware safety model checking. It checks whether a state transition system complies with a given safety property. IC3 either returns UNSAFE (indicating property viol...
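For readers new to the setting, the decision problem itself is easy to state. The sketch below is a plain explicit-state reachability check with IC3's SAFE/UNSAFE interface; it is not the IC3 algorithm, which works symbolically over frames of clauses.

```python
# Hedged sketch of the decision problem IC3/PDR answers: given a transition
# system, return UNSAFE if a bad state is reachable from an initial state,
# else SAFE. This is explicit-state BFS, not IC3 itself.
from collections import deque

def check_safety(init, transitions, bad):
    seen, queue = set(init), deque(init)
    while queue:
        s = queue.popleft()
        if s in bad:
            return "UNSAFE"  # IC3 would also produce a counterexample trace
        for t in transitions.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return "SAFE"  # IC3 would also produce an inductive invariant

transitions = {0: [1], 1: [2], 2: [0]}
print(check_safety(init={0}, transitions=transitions, bad={3}))  # SAFE
print(check_safety(init={0}, transitions=transitions, bad={2}))  # UNSAFE
```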
To Throw a Stone with Six Birds: On Agents and Agenthood
arXiv:2604.03239v1 Announce Type: new
Abstract: Six Birds Theory (SBT) treats macroscopic objects as induced closures rather than primitives. Empirical discussions of agency often conflate persistence (being an object) with control (making a counterfactual difference), which makes agency claims dif...
Position: Science of AI Evaluation Requires Item-level Benchmark Data
arXiv:2604.03244v1 Announce Type: new
Abstract: AI evaluations have become the primary evidence for deploying generative AI systems across high-stakes domains. However, current evaluation paradigms often exhibit systemic validity failures. These issues, ranging from unjustified design choices to mi...
Integrating Artificial Intelligence, Physics, and Internet of Things: A Framework for Cultural Heritage Conservation
arXiv:2604.03233v1 Announce Type: new
Abstract: The conservation of cultural heritage increasingly relies on integrating technological innovation with domain expertise to ensure effective monitoring and predictive maintenance. This paper presents a novel framework to support the preservation of cul...
arXiv:2604.03335v1 Announce Type: new
Abstract: Apparent age estimation is a valuable tool for business personalization, yet current models frequently exhibit demographic biases. We review prior work on the DEX method by applying distribution learning techniques such as Mean-Variance Loss (MVL) an...
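Mean-Variance Loss is a known distribution-learning objective: penalize both the error of the predicted distribution's mean and its spread. A hedged numpy sketch with illustrative weights follows.

```python
# Hedged sketch of Mean-Variance Loss (MVL) for age estimation: the mean of
# the predicted age distribution should match the label, and the variance
# should be small. Weights and the toy prediction are illustrative.
import numpy as np

def mean_variance_loss(probs, ages, y, lam_m=1.0, lam_v=0.05):
    mean = np.sum(probs * ages)
    var = np.sum(probs * (ages - mean) ** 2)
    return lam_m * 0.5 * (mean - y) ** 2 + lam_v * var

ages = np.arange(0, 101)            # age bins 0..100
logits = -0.01 * (ages - 30) ** 2   # toy prediction peaked near age 30
probs = np.exp(logits) / np.exp(logits).sum()
print(f"MVL vs true age 25: {mean_variance_loss(probs, ages, 25):.3f}")
```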
DRAFT: Task Decoupled Latent Reasoning for Agent Safety
arXiv:2604.03242v1 Announce Type: new
Abstract: The advent of tool-using LLM agents shifts safety monitoring from output moderation to auditing long, noisy interaction trajectories, where risk-critical evidence is sparse, making standard binary supervision poorly suited for credit assignment. To add...
arXiv:2604.03240v1 Announce Type: new
Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevant responses aligned with factual evidence and evolving corpora. Standard RAG pipelines construct contex...
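The context-construction step named in the last sentence is standard enough to sketch: embed the query and passages, retrieve the top-$k$ by cosine similarity, and prepend them to the prompt. The toy embedder below is a placeholder assumption.

```python
# Hedged sketch of standard RAG context construction. The hash-seeded
# embedder is a placeholder; a real pipeline would use a trained encoder.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedder: a deterministic per-text random vector.
    return np.random.default_rng(abs(hash(text)) % 2**32).normal(size=64)

corpus = ["Paris is the capital of France.",
          "The Nile is a river in Africa.",
          "Transformers use self-attention."]
query = "What is the capital of France?"

q = embed(query)
sims = [float(embed(d) @ q / (np.linalg.norm(embed(d)) * np.linalg.norm(q)))
        for d in corpus]
top_k = [corpus[i] for i in np.argsort(sims)[::-1][:2]]  # top-2 passages
prompt = "Context:\n" + "\n".join(top_k) + f"\n\nQuestion: {query}\nAnswer:"
print(prompt)
```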
LiME: Lightweight Mixture of Experts for Efficient Multimodal Multi-task Learning
arXiv:2604.02338v1 Announce Type: new
Abstract: MoE-PEFT methods combine Mixture of Experts with parameter-efficient fine-tuning for multi-task adaptation, but require separate adapters per expert, causing trainable parameters to scale linearly with expert count and limiting applicability to adapter...
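Quick arithmetic behind the linear-scaling claim, with illustrative LoRA dimensions (not LiME's configuration):

```python
# Trainable-parameter count for one-LoRA-adapter-per-expert versus a single
# shared adapter. Dimensions are illustrative assumptions.
d_model, rank = 4096, 8
per_adapter = 2 * d_model * rank      # LoRA A (d x r) plus B (r x d)

for n_experts in (4, 8, 16):
    separate = n_experts * per_adapter  # grows linearly with expert count
    shared = per_adapter                # constant in expert count
    print(f"{n_experts:2d} experts: separate={separate:,}  shared={shared:,}")
```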
Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models
arXiv:2604.02340v1 Announce Type: new
Abstract: Recent advances in masked diffusion language models (MDLMs) narrow the quality gap to autoregressive LMs, but their sampling remains expensive because generation requires many full-sequence denoising passes with a large Transformer and, unlike autoreg...
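A hedged sketch of the scheduling idea the title suggests: assign a cheaper model to less critical denoising steps and tally the compute saved. The importance scores and model costs below are illustrative assumptions.

```python
# Hedged sketch of per-step model scheduling for masked diffusion sampling:
# if not all denoising steps are equal, route unimportant steps to a small
# model. Importance scores and relative costs are illustrative.
def schedule(importance, big_cost=1.0, small_cost=0.25, threshold=0.5):
    """Pick a 'big' or 'small' model per denoising step by importance."""
    plan = ["big" if s >= threshold else "small" for s in importance]
    cost = sum(big_cost if m == "big" else small_cost for m in plan)
    return plan, cost

importance = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.2, 0.1]  # early steps matter more
plan, cost = schedule(importance)
print(plan)
print(f"cost vs all-big: {cost:.2f} / {len(importance):.2f}")
```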