AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Jul 3, 2026

PACE: A Neuro-Symbolic Framework for Plausible and Actionable Counterfactual Explanations

arXiv:2607.01306v1 Announce Type: new Abstract: Counterfactual explanations explain machine learning predictions by identifying minimal input changes that would alter a model's decision. Although many existing methods successfully generate prediction-changing alternatives, they often produce unreal...

#ArXiv#Machine Learning#Academic

Tool• Jul 3, 2026

Auto-FL-Research: Agentic Search for Federated Learning Algorithms

arXiv:2607.01366v1 Announce Type: new Abstract: Federated learning (FL) research often depends on many small but consequential algorithmic choices: optimizer variants, server aggregation rules, local training schedules, normalization, regularization, and model architecture. These choices are expens...

#ArXiv#Machine Learning#Academic

Tool• Jul 3, 2026

The Wiola Architecture for Efficient Small Language Models

arXiv:2607.01394v1 Announce Type: new Abstract: We present Wiola, a fully original Small Language Model (SLM) architecture built from first principles, sharing no structural lineage with any existing model family including GPT, LLaMA, Mistral, or Falcon. Wiola introduces five independently novel co...

#ArXiv#Machine Learning#Academic

Tool• Jul 3, 2026

Agent4cs: A Multi-agent System for Code Summarization in Large Hierarchical Codebases

arXiv:2607.01425v1 Announce Type: new Abstract: Understanding large, complex codebases, especially those with obfuscated structures and incomplete documentation, remains a significant challenge. Existing code summarization solutions often rely on a single language model or coding assistant like Cla...

#ArXiv#Machine Learning#Academic

Tool• Jul 3, 2026

When Should Service Agents Reconsider? Difficulty-Routed Control in Customer-Service Operations

arXiv:2607.01426v1 Announce Type: new Abstract: Autonomous customer-service agents are shifting from conversational interfaces toward operational execution roles: they retrieve firm records, apply service policies, and execute backend writes such as refunds, cancellations, exchanges, order modifica...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

Representation as a Bottleneck for Mechanistic Interpretability: The Manifestation Unit Protocol

arXiv:2607.00089v1 Announce Type: new Abstract: Mechanistic interpretability has produced a rich inventory of component-level analyses that characterise what neural-network components encode and how they interact. Their outputs, however, are not easily reusable: selectivity tables, circuit diagrams...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

SNAP-FM: Sparse Nonlinear Accelerated Projection for Physics-Constrained Generative Modeling

arXiv:2607.00095v1 Announce Type: new Abstract: Generative models have emerged as scalable surrogates for physical simulation, yet they offer no guarantee that their outputs respect the conservation laws, boundary conditions, and nonlinear invariants that govern the underlying physics. Constrained ...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

A Filtered Mixture-of-Generators for Fully Synthetic Survival Training

arXiv:2607.00127v1 Announce Type: new Abstract: Survival analysis models time-to-event data, but in clinical settings training data are costly and scarce: events accrue over years of follow-up, cohorts are small, and privacy regulations restrict sharing across institutions. Tabular generative model...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity

arXiv:2607.00152v1 Announce Type: new Abstract: Three of the most popular methods for training language models to reason look like three different tricks. They are not. All three adjust a single number: standard deviation, reflecting how much a prompt's sampled answers disagree. When such a model i...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

Bounded Morality: Defining the Space of Moral Computation

arXiv:2607.00002v1 Announce Type: new Abstract: Moral cognition has traditionally been modeled as adherence to fixed ethical theories--deontology, consequentialism, virtue ethics--implemented as static rules or value functions. We propose Bounded Morality, a formal framework for analyzing the compu...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

The MMM Data Model -- A Normative Specification for Knowledge Interoperability in a Decentralisable Knowledge Commons

arXiv:2607.00032v1 Announce Type: new Abstract: Many information systems are built around documents: self-contained units optimised for print production and linear reading. While effective for large-scale dissemination, the document-centric organisation constrains how knowledge can be structured, u...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

Making Failure Safe: A Constrained, Verifiable Agent Framework for Open-Web Data Collection

arXiv:2607.00035v1 Announce Type: new Abstract: LLMs and agents can generate web scrapers from natural-language requirements, but direct generation remains unreliable because of dependency errors, broken selectors, schema mismatches, and heterogeneous page structures. We propose a constrained, veri...

#ArXiv#Machine Learning#Academic

Tool• Jul 2, 2026

Multi-Agent Teams Hold Experts Back

Multi-agent LLM systems are increasingly deployed as autonomous collaborators, where agents interact freely rather than execute fixed, pre-specified workflows. In such settings, effective coordination cannot be fully designed in advance and must instead emerge through interaction. However, most prio...

#Apple#On-device AI

Tool• Jul 2, 2026

VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization

Visual tokenizers map high-dimensional raw pixels into a compressed representation for downstream modeling. Beyond compression, tokenizers dictate what information is preserved and how it is organized. A de facto standard approach to video tokenization is to represent a video as a spatiotemporal 3D ...

#Apple#On-device AI

Tool• Jul 2, 2026

Amortizing Maximum Inner Product Search with Learned Support Functions

Maximum inner product search (MIPS) is a crucial subroutine in machine learning, requiring the identification of a vector taken within a database (the keys) that best aligns with a given query. We propose amortized MIPS: a regression-based approach that trains neural networks to directly predict MIP...

#Apple#On-device AI

Tool• Jul 2, 2026

On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs

Reinforcement learning (RL) finetuning has become a key technique for enhancing large language models (LLMs) on reasoning-intensive tasks, motivating its extension to vision language models (VLMs). While RL-tuned VLMs improve on visual reasoning benchmarks, they remain vulnerable to weak visual grou...

#Apple#On-device AI

Tool• Jul 2, 2026

MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers

Understanding how transformer components operate in LLMs is important, as it is at the core of recent technological advances in artificial intelligence. In this work, we revisit the challenges associated with interpretability of feed-forward modules (FFNs) and propose MemoryLLM, which aims to decoup...

#Apple#On-device AI

Tool• Jul 1, 2026

Accelerometry-Derived Digital Biomarkers for Cardiometabolic Risk: A Population-Representative Tabular Benchmark with Uncertainty Quantification

arXiv:2606.30702v1 Announce Type: new Abstract: Structured tabular data dominates clinical medicine, yet existing benchmarks fail to reflect real-world properties like complex survey sampling, demographic oversampling, and subgroup fairness. We introduce the NHANES Accelerometry Cardiometabolic Ben...

#ArXiv#Machine Learning#Academic

Tool• Jul 1, 2026

From Search to Synthesis: Training LLMs as Zero-Shot Workflow Generators

arXiv:2606.30704v1 Announce Type: new Abstract: Large language models (LLMs) excel across a wide range of tasks, yet their instance-specific solutions often lack the structural consistency needed for reliable deployment. Workflows that encode recurring algorithmic patterns at the task level provide...

#ArXiv#Machine Learning#Academic

Tool• Jul 1, 2026

What Drives Interactive Improvement from Feedback?

arXiv:2606.30774v1 Announce Type: new Abstract: We study when natural-language feedback produces improvement beyond the gains obtainable from repeated attempts alone. In a multi-turn language agent setting, higher final accuracy can reflect useful feedback, but it can also arise from resampling, form...

#ArXiv#Machine Learning#Academic

Tool• Jul 1, 2026

Contrastive Reflection for Iterative Prompt Optimization

arXiv:2606.30840v1 Announce Type: new Abstract: LLM agents are becoming central to information retrieval: they issue retrieval queries, synthesize answers, and increasingly serve as judges for IR evaluation. Improving the prompts that control these agents is an optimization problem, but in applied ...

#ArXiv#Machine Learning#Academic

Tool• Jul 1, 2026

BayesBench: Evaluating LLM Belief Trajectories Under Multi-Turn Evidence Accumulation

arXiv:2606.30850v1 Announce Type: new Abstract: Large language models (LLMs) are typically deployed in multi-turn conversations, where each turn provides new evidence that should reduce epistemic uncertainty about their environment. Acting rationally then requires inferring the unobserved quantitie...

#ArXiv#Machine Learning#Academic

Tool• Jun 30, 2026

SkillOpt: Agent skills as trainable parameters

AI agents often fail because their instructions, or skills, are manually modified with no guarantee of improvement. Learn how SkillOpt turns skill editing into a training process, making agent behavior more reliable without changing model weights.

#Microsoft#Research

Tool• Jun 30, 2026

Start building with Nano Banana 2 Lite and Gemini Omni Flash

#DeepMind#Google#AGI
