AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Mar 17, 2026

Benchmarking Zero-Shot Reasoning Approaches for Error Detection in Solidity Smart Contracts

arXiv:2603.13239v1 Announce Type: new Abstract: Smart contracts play a central role in blockchain systems by encoding financial and operational logic. Still, their susceptibility to subtle security flaws poses significant risks of financial loss and erosion of trust. LLMs create new opportunities f...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Think First, Diffuse Fast: Improving Diffusion Language Model Reasoning via Autoregressive Plan Conditioning

arXiv:2603.13243v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) generate text via iterative denoising but consistently underperform on multi-step reasoning. We hypothesize this gap stems from a coordination problem: AR models build coherence token-by-token, while diffusion m...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

Automating Document Intelligence in Statutory City Planning

arXiv:2603.13245v1 Announce Type: new Abstract: UK planning authorities face a legislative conflict between the Planning Act, which mandates public access to application documents, and the Data Protection Act, which requires protection of personal information. This situation creates a manually inte...

#ArXiv#Machine Learning#Academic

Tool• Mar 17, 2026

TrajTok: Learning Trajectory Tokens enables better Video Understanding

Tokenization in video models, typically through patchification, generates an excessive and redundant number of tokens. This severely limits video efficiency and scalability. While recent trajectory-based tokenizers offer a promising solution by decoupling video duration from token count, they rely o...

#Apple#On-device AI

Tool• Mar 16, 2026

From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness

arXiv:2603.12288v1 Announce Type: new Abstract: Tabular machine learning presents a paradox: modern models achieve state-of-the-art performance using high-dimensional (high-D), collinear, error-prone data, defying the "Garbage In, Garbage Out" mantra. To help resolve this, we synthesize principles ...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction

arXiv:2603.12293v1 Announce Type: new Abstract: Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these, we propos...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency

arXiv:2603.12298v1 Announce Type: new Abstract: Activation engineering enables precise control over Large Language Models (LLMs) without the computational cost of fine-tuning. However, existing methods deriving vectors from static activation differences are susceptible to high-dimensional noise and...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Efficient Reasoning with Balanced Thinking

arXiv:2603.12372v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, yet they often suffer from overthinking, expending redundant computational steps on simple problems, or underthinking, failing to explore sufficient reasoning paths despite in...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Generating Expressive and Customizable Evals for Timeseries Data Analysis Agents with AgentFuel

arXiv:2603.12483v1 Announce Type: new Abstract: Across many domains (e.g., IoT, observability, telecommunications, cybersecurity), there is an emerging adoption of conversational data analysis agents that enable users to "talk to your data" to extract insights. Such data analysis agents operate on ...

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

AI Planning Framework for LLM-Based Web Agents

arXiv:2603.12710v1 Announce Type: new Abstract: Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it difficult to diagnose why they fail or how they plan....

#ArXiv#Machine Learning#Academic

Tool• Mar 16, 2026

Scientists discover AI can make humans more creative

Artificial intelligence is often portrayed as a tool that replaces human work, but new research from Swansea University suggests a far more exciting role: creative collaborator. In a large study with more than 800 participants designing virtual cars, researchers found that AI-generated design galler...

#Science Daily#AI#Research

Tool• Mar 13, 2026

Unlocking the power of data: How we built text-to-SQL with agentic RAG at Rocket Mortgage

Your company’s data holds answers, but accessing them is often the hard part. Here’s how Rocket Mortgage built a text-to-SQL system with agentic RAG to make data accessible to everyone.

#AI Accelerator Institute#AI#Research

Tool• Mar 13, 2026

Identifying Interactions at Scale for LLMs

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence. Interpretability research aims to make the decision-making process more transparent to model builders and impacted humans, a ...

#Berkeley#BAIR#Academic

Tool• Mar 13, 2026

Scientists built the hardest AI test ever and the results are surprising

As AI systems began acing traditional tests, researchers realized those benchmarks were no longer tough enough. In response, nearly 1,000 experts created Humanity’s Last Exam, a massive 2,500-question challenge covering highly specialized topics across many fields. The exam was engineered so that an...

#Science Daily#AI#Research

Tool• Mar 13, 2026

Interventional Time Series Priors for Causal Foundation Models

arXiv:2603.11090v1 Announce Type: new Abstract: Prior-data fitted networks (PFNs) have emerged as powerful foundation models for tabular causal inference, yet their extension to time series remains limited by the absence of synthetic data generators that provide interventional targets. Existing tim...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

Graph Tokenization for Bridging Graphs and Transformers

arXiv:2603.11099v1 Announce Type: new Abstract: The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these models to graph-structured data remains a significant challenge. In this work, we introduce a graph tokenization...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use

arXiv:2603.11076v1 Announce Type: new Abstract: Recent work synthesizes agentic tasks for post-training tool-using LLMs, yet robust generalization under shifts in tasks and toolsets remains an open challenge. We trace this brittleness to insufficient diversity in synthesized tasks. Scaling diversit...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

A Survey of Reasoning in Autonomous Driving Systems: Open Challenges and Emerging Paradigms

arXiv:2603.11093v1 Announce Type: new Abstract: The development of high-level autonomous driving (AD) is shifting from perception-centric limitations to a more fundamental bottleneck, namely, a deficit in robust and generalizable reasoning. Although current AD systems manage structured environments...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

PACED: Distillation at the Frontier of Student Competence

arXiv:2603.11178v1 Announce Type: new Abstract: Standard LLM distillation wastes compute on two fronts: problems the student has already mastered (near-zero gradients) and problems far beyond its reach (incoherent gradients that erode existing capabilities). We show that this waste is not merely in...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

arXiv:2603.11214v1 Announce Type: new Abstract: We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across exten...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

Reversible Lifelong Model Editing via Semantic Routing-Based LoRA

arXiv:2603.11239v1 Announce Type: new Abstract: The dynamic evolution of real-world necessitates model editing within Large Language Models. While existing methods explore modular isolation or parameter-efficient strategies, they still suffer from semantic drift or knowledge forgetting due to conti...

#ArXiv#Machine Learning#Academic

Tool• Mar 13, 2026

mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVR

Reinforcement Learning with Verifiable Rewards (RLVR) has been successfully applied to significantly boost the capabilities of pretrained large language models, especially in the math and logic problem domains. However, current research and available training datasets remain English-centric. While m...

#Apple#On-device AI

Tool• Mar 13, 2026

Multilingual Reasoning Gym: Multilingual Scaling of Procedural Reasoning Environments

We present the Multilingual Reasoning Gym, an extension of Reasoning Gym (Stojanovski et al., 2025), that procedurally generates verifiable reasoning problems across 14 languages. We translate templates for 94 tasks with native-speaker validation in 10 languages and targeted code or template adaptat...

#Apple#On-device AI

Tool• Mar 12, 2026

Meta buys Moltbook: The social network where AI agents talk to each other

Meta’s acquisition of Moltbook highlights a growing focus on agent-to-agent systems and the infrastructure required to support them. It’s a small deal that signals bigger shifts in how AI ecosystems may evolve.

#AI Accelerator Institute#AI#Research
