AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Mar 18, 2026

Steering Frozen LLMs: Adaptive Social Alignment via Online Prompt Routing

arXiv:2603.15647v1 Announce Type: new Abstract: Large language models (LLMs) are typically governed by post-training alignment (e.g., RLHF or DPO), which yields a largely static policy during deployment and inference. However, real-world safety is a full-lifecycle problem: static defenses degrade a...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

Neural-Symbolic Logic Query Answering in Non-Euclidean Space

arXiv:2603.15633v1 Announce Type: new Abstract: Answering complex first-order logic (FOL) queries on knowledge graphs is essential for reasoning. Symbolic methods offer interpretability but struggle with incomplete graphs, while neural approaches generalize better but lack transparency. Neural-symb...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

NextMem: Towards Latent Factual Memory for LLM-based Agents

arXiv:2603.15634v1 Announce Type: new Abstract: Memory is critical for LLM-based agents to preserve past observations for future decision-making, where factual memory serves as its foundational part. However, existing approaches to constructing factual memory face several limitations. Textual metho...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

AIDABench: AI Data Analytics Benchmark

arXiv:2603.15636v1 Announce Type: new Abstract: As AI-driven document understanding and processing tools become increasingly prevalent in real-world applications, the need for rigorous evaluation standards has grown increasingly urgent. Existing benchmarks and evaluations often focus on isolated ca...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

The Comprehension-Gated Agent Economy: A Robustness-First Architecture for AI Economic Agency

arXiv:2603.15639v1 Announce Type: new Abstract: AI agents are increasingly granted economic agency (executing trades, managing budgets, negotiating contracts, and spawning sub-agents), yet current frameworks gate this agency on capability benchmarks that are empirically uncorrelated with operationa...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

Form Follows Function: Recursive Stem Model

arXiv:2603.15641v1 Announce Type: new Abstract: Recursive reasoning models such as Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM) show that small, weight-shared networks can solve compute-heavy and NP puzzles by iteratively refining latent states, but their training typically rel...

#ArXiv#Machine Learning#Academic

Tool• Mar 18, 2026

Prose2Policy (P2P): A Practical LLM Pipeline for Translating Natural-Language Access Policies into Executable Rego

Prose2Policy (P2P) is a LLM-based practical tool that translates natural-language access control policies (NLACPs) into executable Rego code (the policy language of Open Policy Agent, OPA). It provides a modular, end-to-end pipeline that performs policy detection, component extraction, schema valida...

#Apple#On-device AI

Tool• Mar 18, 2026

Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning

Reinforcement learning has emerged as a powerful paradigm for unlocking reasoning capabilities in large language models. However, relying on sparse rewards makes this process highly sample-inefficient, as models must navigate vast search spaces with minimal feedback. While classic curriculum learnin...

#Apple#On-device AI

Tool• Mar 17, 2026

Measuring progress toward AGI: A cognitive framework

We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.

#DeepMind#Google#AGI

Tool• Mar 17, 2026

Translational Gaps in Graph Transformers for Longitudinal EHR Prediction: A Critical Appraisal of GT-BEHRT

arXiv:2603.13231v1 Announce Type: new Abstract: Transformer-based models have improved predictive modeling on longitudinal electronic health records through large-scale self-supervised pretraining. However, most EHR transformer architectures treat each clinical encounter as an unordered collection ...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Your Code Agent Can Grow Alongside You with Structured Memory

arXiv:2603.13258v1 Announce Type: new Abstract: While "Intent-oriented programming" (or "Vibe Coding") redefines software engineering, existing code agents remain tethered to static code snapshots. Consequently, they struggle to model the critical information embedded in the temporal evolution of p...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

A Dual-Path Generative Framework for Zero-Day Fraud Detection in Banking Systems

arXiv:2603.13237v1 Announce Type: new Abstract: High-frequency banking environments face a critical trade-off between low-latency fraud detection and the regulatory explainability demanded by GDPR. Traditional rule-based and discriminative models struggle with "zero-day" attacks due to extreme clas...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Benchmarking Zero-Shot Reasoning Approaches for Error Detection in Solidity Smart Contracts

arXiv:2603.13239v1 Announce Type: new Abstract: Smart contracts play a central role in blockchain systems by encoding financial and operational logic. Still, their susceptibility to subtle security flaws poses significant risks of financial loss and erosion of trust. LLMs create new opportunities f...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Think First, Diffuse Fast: Improving Diffusion Language Model Reasoning via Autoregressive Plan Conditioning

arXiv:2603.13243v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) generate text via iterative denoising but consistently underperform on multi-step reasoning. We hypothesize this gap stems from a coordination problem: AR models build coherence token-by-token, while diffusion m...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Automating Document Intelligence in Statutory City Planning

arXiv:2603.13245v1 Announce Type: new Abstract: UK planning authorities face a legislative conflict between the Planning Act, which mandates public access to application documents, and the Data Protection Act, which requires protection of personal information. This situation creates a manually inte...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

TrajTok: Learning Trajectory Tokens enables better Video Understanding

Tokenization in video models, typically through patchification, generates an excessive and redundant number of tokens. This severely limits video efficiency and scalability. While recent trajectory-based tokenizers offer a promising solution by decoupling video duration from token count, they rely o...

#Apple#On-device AI

Tool• Mar 16, 2026

From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness

arXiv:2603.12288v1 Announce Type: new Abstract: Tabular machine learning presents a paradox: modern models achieve state-of-the-art performance using high-dimensional (high-D), collinear, error-prone data, defying the "Garbage In, Garbage Out" mantra. To help resolve this, we synthesize principles ...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction

arXiv:2603.12293v1 Announce Type: new Abstract: Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these, we propos...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency

arXiv:2603.12298v1 Announce Type: new Abstract: Activation engineering enables precise control over Large Language Models (LLMs) without the computational cost of fine-tuning. However, existing methods deriving vectors from static activation differences are susceptible to high-dimensional noise and...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Efficient Reasoning with Balanced Thinking

arXiv:2603.12372v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, yet they often suffer from overthinking, expending redundant computational steps on simple problems, or underthinking, failing to explore sufficient reasoning paths despite in...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Generating Expressive and Customizable Evals for Timeseries Data Analysis Agents with AgentFuel

arXiv:2603.12483v1 Announce Type: new Abstract: Across many domains (e.g., IoT, observability, telecommunications, cybersecurity), there is an emerging adoption of conversational data analysis agents that enable users to "talk to your data" to extract insights. Such data analysis agents operate on ...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

AI Planning Framework for LLM-Based Web Agents

arXiv:2603.12710v1 Announce Type: new Abstract: Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it difficult to diagnose why they fail or how they plan....

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Scientists discover AI can make humans more creative

Artificial intelligence is often portrayed as a tool that replaces human work, but new research from Swansea University suggests a far more exciting role: creative collaborator. In a large study with more than 800 participants designing virtual cars, researchers found that AI-generated design galler...

#Science Daily#AI#Research

Tool• Mar 13, 2026

Unlocking the power of data: How we built text-to-SQL with agentic RAG at Rocket Mortgage

Your company’s data holds answers, but accessing them is often the hard part. Here’s how Rocket Mortgage built a text-to-SQL system with agentic RAG to make data accessible to everyone.

#AI Accelerator Institute#AI#Research

← Prev

1...10 11 12 13 14...37