AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Feb 5, 2026

AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

arXiv:2602.03955v1 Announce Type: new Abstract: While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, practical deployment is limited by their high computational cost and error propagation. This paper proposes AgentArk, a novel framewo...

#ArXiv#Machine Learning#Academic

Tool• Feb 5, 2026

Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure

arXiv:2602.03975v1 Announce Type: new Abstract: Test-time computation has become a primary driver of progress in large language model (LLM) reasoning, but it is increasingly bottlenecked by expensive verification. In many reasoning systems, a large fraction of verifier calls are spent on redundant ...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

Augmenting Parameter-Efficient Pre-trained Language Models with Large Language Models

arXiv:2602.02501v1 Announce Type: new Abstract: Training AI models in cybersecurity with the help of vast datasets offers significant opportunities to mimic real-world behaviors effectively. However, challenges like data drift and scarcity of labelled data lead to frequent updates of models and the ris...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

What Drives Length of Stay After Elective Spine Surgery? Insights from a Decade of Predictive Modeling

arXiv:2602.02517v1 Announce Type: new Abstract: Objective: Predicting length of stay after elective spine surgery is essential for optimizing patient outcomes and hospital resource use. This systematic review synthesizes computational methods used to predict length of stay in this patient populatio...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

CreditAudit: 2nd Dimension for LLM Evaluation and Selection

arXiv:2602.02515v2 Announce Type: new Abstract: Leaderboard scores on public benchmarks have been steadily rising and converging, with many frontier language models now separated by only marginal differences. However, these scores often fail to match users' day-to-day experience, because system pro...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

Experience-Driven Multi-Agent Systems Are Training-free Context-aware Earth Observers

arXiv:2602.02559v1 Announce Type: new Abstract: Recent advances have enabled large language model (LLM) agents to solve complex tasks by orchestrating external tools. However, these agents often struggle in specialized, tool-intensive domains that demand long-horizon execution, tight coordination a...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

Uncertainty and Fairness Awareness in LLM-Based Recommendation Systems

arXiv:2602.02582v1 Announce Type: new Abstract: Large language models (LLMs) enable powerful zero-shot recommendations by leveraging broad contextual knowledge, yet predictive uncertainty and embedded biases threaten reliability and fairness. This paper studies how uncertainty and fairness evaluati...

#ArXiv#Machine Learning#Academic

Tool• Feb 4, 2026

A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior

arXiv:2602.02639v1 Announce Type: new Abstract: LLM self-explanations are often presented as a promising tool for AI oversight, yet their faithfulness to the model's true reasoning process is poorly understood. Existing faithfulness metrics have critical limitations, typically relying on identifyin...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

OGD4All: A Framework for Accessible Interaction with Geospatial Open Government Data Based on Large Language Models

arXiv:2602.00012v1 Announce Type: new Abstract: We present OGD4All, a transparent, auditable, and reproducible framework based on Large Language Models (LLMs) to enhance citizens' interaction with geospatial Open Government Data (OGD). The system combines semantic data retrieval, agentic reasoning ...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

Measurement for Opaque Systems: Multi-source Triangulation with Interpretable Machine Learning

arXiv:2602.00022v1 Announce Type: new Abstract: We propose a measurement framework for difficult-to-access contexts that uses indirect data traces, interpretable machine-learning models, and theory-guided triangulation to fill inaccessible measurement spaces. Many high-stakes systems of scientific ...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

ELLMPEG: An Edge-based Agentic LLM Video Processing Tool

arXiv:2602.00028v1 Announce Type: new Abstract: Large language models (LLMs), the foundation of generative AI systems like ChatGPT, are transforming many fields and applications, including multimedia, enabling more advanced content generation, analysis, and interaction. However, cloud-based LLM dep...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

RAPTOR-AI for Disaster OODA Loop: Hierarchical Multimodal RAG with Experience-Driven Agentic Decision-Making

arXiv:2602.00030v1 Announce Type: new Abstract: Effective humanitarian assistance and disaster relief (HADR) requires rapid situational understanding, reliable decision support, and the ability to generalize across diverse and previously unseen disaster contexts. This work introduces an agentic Ret...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

Scalable and Secure AI Inference in Healthcare: A Comparative Benchmarking of FastAPI and Triton Inference Server on Kubernetes

arXiv:2602.00053v1 Announce Type: new Abstract: Efficient and scalable deployment of machine learning (ML) models is a prerequisite for modern production environments, particularly within regulated domains such as healthcare and pharmaceuticals. In these settings, systems must balance competing req...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

Learning to Price: Interpretable Attribute-Level Models for Dynamic Markets

arXiv:2602.00188v1 Announce Type: new Abstract: Dynamic pricing in high-dimensional markets poses fundamental challenges of scalability, uncertainty, and interpretability. Existing low-rank bandit formulations learn efficiently but rely on latent features that obscure how individual product attribu...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

From Gameplay Traces to Game Mechanics: Causal Induction with Large Language Models

arXiv:2602.00190v1 Announce Type: new Abstract: Deep learning agents can achieve high performance in complex game domains, often without understanding the underlying causal game mechanics. To address this, we investigate Causal Induction: the ability to infer governing laws from observational data, ...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

Localizing and Correcting Errors for LLM-based Planners

arXiv:2602.00276v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated strong reasoning capabilities on math and coding, but frequently fail on symbolic classical planning tasks. Our studies, as well as prior work, show that LLM-generated plans routinely violate domain const...

#ArXiv#Machine Learning#Academic

Tool• Feb 3, 2026

A Reinforcement Learning Based Universal Sequence Design for Polar Codes

To advance Polar code design for 6G applications, we develop a reinforcement learning-based universal sequence design framework that is extensible and adaptable to diverse channel conditions and decoding strategies. Crucially, our method scales to code lengths up to 2048, making it suitable for use ...

#Apple#On-device AI

Tool• Feb 2, 2026

Attention Isn't All You Need for Emotion Recognition: Domain Features Outperform Transformers on the EAV Dataset

arXiv:2601.22161v1 Announce Type: new Abstract: We present a systematic study of multimodal emotion recognition using the EAV dataset, investigating whether complex attention mechanisms improve performance on small datasets. We implement three model categories: baseline transformers (M1), novel fac...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

Multitask Learning for Earth Observation Data Classification with Hybrid Quantum Network

arXiv:2601.22195v1 Announce Type: new Abstract: Quantum machine learning (QML) has gained increasing attention as a potential solution to address the challenges of computation requirements in the future. Earth observation (EO) has entered the era of Big Data, and the computational demands for effec...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

Neural Signals Generate Clinical Notes in the Wild

arXiv:2601.22197v1 Announce Type: new Abstract: Generating clinical reports that summarize abnormal patterns, diagnostic findings, and clinical interpretations from long-term EEG recordings remains labor-intensive. We curate a large-scale clinical EEG dataset with 9,922 reports paired with appr...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

FedAdaVR: Adaptive Variance Reduction for Robust Federated Learning under Limited Client Participation

arXiv:2601.22204v1 Announce Type: new Abstract: Federated learning (FL) encounters substantial challenges due to heterogeneity, leading to gradient noise, client drift, and partial client participation errors, the last of which is the most pervasive but remains insufficiently addressed in current l...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

JAF: Judge Agent Forest

arXiv:2601.22269v1 Announce Type: new Abstract: Judge agents are fundamental to agentic AI frameworks: they provide automated evaluation, and enable iterative self-refinement of reasoning processes. We introduce JAF: Judge Agent Forest, a framework in which the judge agent conducts joint inference ...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

The Six Sigma Agent: Achieving Enterprise-Grade Reliability in LLM Systems Through Consensus-Driven Decomposed Execution

arXiv:2601.22290v1 Announce Type: new Abstract: Large Language Models demonstrate remarkable capabilities yet remain fundamentally probabilistic, presenting critical reliability challenges for enterprise deployment. We introduce the Six Sigma Agent, a novel architecture that achieves enterprise-gra...

#ArXiv#Machine Learning#Academic

Tool• Feb 2, 2026

Sparks of Rationality: Do Reasoning LLMs Align with Human Judgment and Choice?

arXiv:2601.22329v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly positioned as decision engines for hiring, healthcare, and economic judgment, yet real-world human judgment reflects a balance between rational deliberation and emotion-driven bias. If LLMs are to particip...

#ArXiv#Machine Learning#Academic
