AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Apr 27, 2026

An Artifact-based Agent Framework for Adaptive and Reproducible Medical Image Processing

arXiv:2604.21936v1 Announce Type: new Abstract: Medical imaging research is increasingly shifting from controlled benchmark evaluation toward real-world clinical deployment. In such settings, applying analytical methods extends beyond model design to require dataset-aware workflow configuration and...

#ArXiv#Machine Learning#Academic

Tool• Apr 27, 2026

MolClaw: An Autonomous Agent with Hierarchical Skills for Drug Molecule Evaluation, Screening, and Optimization

arXiv:2604.21937v1 Announce Type: new Abstract: Computational drug discovery, particularly the complex workflows of drug molecule screening and optimization, requires orchestrating dozens of specialized tools in multi-step workflows, yet current AI agents struggle to maintain robust performance and...

#ArXiv#Machine Learning#Academic

Tool• Apr 27, 2026

Math Takes Two: A test for emergent mathematical reasoning in communication

arXiv:2604.21935v1 Announce Type: new Abstract: Although language models demonstrate remarkable proficiency on mathematical benchmarks, it remains unclear whether this reflects true mathematical reasoning or statistical pattern matching over learning formal syntax. Most existing evaluations rely on...

#ArXiv#Machine Learning#Academic

Tool• Apr 27, 2026

Read the Paper, Write the Code: Agentic Reproduction of Social-Science Results

arXiv:2604.21965v1 Announce Type: new Abstract: Recent work has used LLM agents to reproduce empirical social science results with access to both the data and code. We broaden this scope by asking: Can they reproduce results given only a paper's methods description and original data? We develop an ...

#ArXiv#Machine Learning#Academic

Tool• Apr 25, 2026

Escaping the Agreement Trap: Defensibility Signals for Evaluating Rule-Governed AI

arXiv:2604.20972v1 Announce Type: new Abstract: Content moderation systems are typically evaluated by measuring agreement with human labels. In rule-governed environments this assumption fails: multiple decisions may be logically consistent with the governing policy, and agreement metrics penalize ...

#ArXiv#Machine Learning#Academic

Tool• Apr 25, 2026

Value-Conflict Diagnostics Reveal Widespread Alignment Faking in Language Models

arXiv:2604.20995v1 Announce Type: new Abstract: Alignment faking, where a model behaves aligned with developer policy when monitored but reverts to its own preferences when unobserved, is a concerning yet poorly understood phenomenon, in part because current diagnostic tools remain limited. Prior d...

#ArXiv#Machine Learning#Academic

Tool• Apr 25, 2026

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

arXiv:2604.20987v1 Announce Type: new Abstract: Long-horizon interactive environments are a testbed for evaluating agents' skill-usage abilities. These environments demand multi-step reasoning, the chaining of multiple skills over many timesteps, and robust decision making under delayed rewards and ...

#ArXiv#Machine Learning#Academic

Tool• Apr 25, 2026

The Last Harness You'll Ever Build

arXiv:2604.21003v1 Announce Type: new Abstract: AI agents are increasingly deployed on complex, domain-specific workflows -- navigating enterprise web applications that require dozens of clicks and form fills, orchestrating multi-step research pipelines that span search, extraction, and synthesis, ...

#ArXiv#Machine Learning#Academic

Tool• Apr 24, 2026

AI to Learn 2.0: A Deliverable-Oriented Governance Framework and Maturity Rubric for Opaque AI in Learning-Intensive Domains

arXiv:2604.19751v1 Announce Type: new Abstract: Generative AI is entering research, education, and professional work faster than current governance frameworks can specify how AI-assisted outputs should be judged in learning-intensive settings. The central problem is proxy failure: a polished artifa...

#ArXiv#Machine Learning#Academic

Tool• Apr 24, 2026

Exploring Data Augmentation and Resampling Strategies for Transformer-Based Models to Address Class Imbalance in AI Scoring of Scientific Explanations in NGSS Classroom

arXiv:2604.19754v1 Announce Type: new Abstract: Automated scoring of students' scientific explanations offers the potential for immediate, accurate feedback, yet class imbalance in rubric categories, particularly those capturing advanced reasoning, remains a challenge. This study investigates augment...

#ArXiv#Machine Learning#Academic

Tool• Apr 24, 2026

Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks

arXiv:2604.19755v1 Announce Type: new Abstract: Anti-money laundering (AML) transaction monitoring generates large volumes of alerts that must be rapidly triaged by investigators under strict audit and governance constraints. While large language models (LLMs) can summarize heterogeneous evidence a...

#ArXiv#Machine Learning#Academic

Tool• Apr 23, 2026

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts

arXiv:2604.19835v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) has become the dominant architecture for scaling large language models: frontier models routinely decouple total parameters from per-token computation through sparse expert routing. Scaling laws show that under fixed active co...

#ArXiv#Machine Learning#Academic

Tool• Apr 23, 2026

WorkflowGen: an adaptive workflow generation mechanism driven by trajectory experience

arXiv:2604.19756v1 Announce Type: new Abstract: Large language model (LLM) agents often suffer from high reasoning overhead, excessive token consumption, unstable execution, and inability to reuse past experiences in complex tasks like business queries, tool use, and workflow orchestration. Traditi...

#ArXiv#Machine Learning#Academic

Tool• Apr 23, 2026

Transparent Screening for LLM Inference and Training Impacts

arXiv:2604.19757v1 Announce Type: new Abstract: This paper presents a transparent screening framework for estimating inference and training impacts of current large language models under limited observability. The framework converts natural-language application descriptions into bounded environment...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

FASE: A Fairness-Aware Spatiotemporal Event Graph Framework for Predictive Policing

arXiv:2604.18644v1 Announce Type: new Abstract: Predictive policing systems that allocate patrol resources based solely on predicted crime risk can unintentionally amplify racial disparities through feedback-driven data bias. We present FASE, a Fairness-Aware Spatiotemporal Event Graph framework, w...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

arXiv:2604.18701v1 Announce Type: new Abstract: Local prediction-error-based curiosity rewards focus on the current transition without considering the world model's cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the i...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

Easy Samples Are All You Need: Self-Evolving LLMs via Data-Efficient Reinforcement Learning

arXiv:2604.18639v1 Announce Type: new Abstract: Previous LLM-based RL studies typically follow either supervised learning with high annotation costs, or unsupervised paradigms using voting or entropy-based rewards. However, their performance remains far from satisfactory due to the substantial ann...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

Compile to Compress: Boosting Formal Theorem Provers by Compiler Outputs

arXiv:2604.18587v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated significant potential in formal theorem proving, yet state-of-the-art performance often necessitates prohibitive test-time compute via massive roll-outs or extended context windows. In this work, we addre...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

AI scientists produce results without reasoning scientifically

arXiv:2604.18805v1 Announce Type: new Abstract: Large language model (LLM)-based systems are increasingly deployed to conduct scientific research autonomously, yet whether their reasoning adheres to the epistemic norms that make scientific inquiry self-correcting is poorly understood. Here, we eval...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System

arXiv:2604.18789v1 Announce Type: new Abstract: Reinforcement Learning from Human Feedback (RLHF) is central to aligning Large Language Models (LLMs), yet it introduces a critical vulnerability: an imperfect Reward Model (RM) can become a single point of failure when it fails to penalize unsafe beh...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

Quantum inspired qubit qutrit neural networks for real time financial forecasting

arXiv:2604.18838v1 Announce Type: new Abstract: This research investigates the performance and efficacy of machine learning models in stock prediction, comparing Artificial Neural Networks (ANNs), Quantum Qubit-based Neural Networks (QQBNs), and Quantum Qutrit-based Neural Networks (QQTNs). By outl...

#ArXiv#Machine Learning#Academic

Tool• Apr 22, 2026

Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations

arXiv:2604.18724v1 Announce Type: new Abstract: Users typically interact with and evaluate language models via single outputs, but each output is just one sample from a broad distribution of possible completions. This interaction hides distributional structure such as modes, uncommon edge cases, an...

#ArXiv#Machine Learning#Academic

Tool• Apr 21, 2026

UniMamba: A Unified Spatial-Temporal Modeling Framework with State-Space and Attention Integration

arXiv:2604.16325v1 Announce Type: new Abstract: Multivariate time series forecasting is fundamental to numerous domains such as energy, finance, and environmental monitoring, where complex temporal dependencies and cross-variable interactions pose enduring challenges. Existing Transformer-based met...

#ArXiv#Machine Learning#Academic

Tool• Apr 21, 2026

A Discordance-Aware Multimodal Framework with Multi-Agent Clinical Reasoning

arXiv:2604.16333v1 Announce Type: new Abstract: Knee osteoarthritis frequently exhibits discordance between structural damage observed in imaging and patient-reported symptoms such as pain. This mismatch complicates clinical interpretation and patient stratification and remains insufficiently model...

#ArXiv#Machine Learning#Academic
