AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

Tool• Feb 11, 2026

Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning

arXiv:2602.09066v1 Announce Type: new Abstract: Large-scale multimodal contrastive learning has recently achieved impressive success in learning rich and transferable representations, yet it remains fundamentally limited by the uniform treatment of feature dimensions and the neglect of the intrinsi...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

Learning to Remember, Learn, and Forget in Attention-Based Models

arXiv:2602.09075v1 Announce Type: new Abstract: In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is ...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

Patient foundation model for risk stratification in low-risk overweight patients

arXiv:2602.09079v1 Announce Type: new Abstract: Accurate risk stratification in patients with overweight or obesity is critical for guiding preventive care and allocating high-cost therapies such as GLP-1 receptor agonists. We present PatientTPP, a neural temporal point process (TPP) model trained ...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models

arXiv:2602.09080v1 Announce Type: new Abstract: Large Multimodal Models (LMMs) have achieved remarkable success in vision-language tasks, yet their vast parameter counts are often underutilized during both training and inference. In this work, we embrace the idea of looping back to move forward: re...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

arXiv:2602.09112v1 Announce Type: new Abstract: What research can be pursued with small models trained to complete true programs? Typically, researchers study program synthesis via large language models (LLMs) which introduce issues such as knowing what is in or out of distribution, understanding f...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization

arXiv:2602.09121v1 Announce Type: new Abstract: In this work, we present a lightweight and privacy-preserving Multimodal Emotion Recognition (MER) framework designed for deployment on edge devices. To demonstrate the framework's versatility, our implementation uses three modalities - speech, text and f...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

PABU: Progress-Aware Belief Update for Efficient LLM Agents

arXiv:2602.09138v1 Announce Type: new Abstract: Large Language Model (LLM) agents commonly condition actions on full action-observation histories, which introduce task-irrelevant information that easily leads to redundant actions and higher inference cost. We propose Progress-Aware Belief Update (P...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

CoMMa: Contribution-Aware Medical Multi-Agents From A Game-Theoretic Perspective

arXiv:2602.09159v1 Announce Type: new Abstract: Recent multi-agent frameworks have broadened the ability to tackle oncology decision support tasks that require reasoning over dynamic, heterogeneous patient data. We propose Contribution-Aware Medical Multi-Agents (CoMMa), a decentralized LLM-agent f...

#ArXiv#Machine Learning#Academic

Tool• Feb 11, 2026

FlyAOC: Evaluating Agentic Ontology Curation of Drosophila Scientific Knowledge Bases

arXiv:2602.09163v1 Announce Type: new Abstract: Scientific knowledge bases accelerate discovery by curating findings from primary literature into structured, queryable formats for both human researchers and emerging AI systems. Maintaining these resources requires expert curators to search relevant...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

Attractor Patch Networks: Reducing Catastrophic Forgetting with Routed Low-Rank Patch Experts

arXiv:2602.06993v1 Announce Type: new Abstract: Transformers achieve strong language modeling accuracy, yet their position-wise feed-forward networks (FFNs) are dense, globally shared, and typically updated end to end. These properties create two practical tensions. First, dense FFNs spend the same...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

Neural Sabermetrics with World Model: Play-by-play Predictive Modeling with Large Language Model

arXiv:2602.07030v1 Announce Type: new Abstract: Classical sabermetrics has profoundly shaped baseball analytics by summarizing long histories of play into compact statistics. While these metrics are invaluable for valuation and retrospective analysis, they do not define a generative model of how ba...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

TransConv-DDPM: Enhanced Diffusion Model for Generating Time-Series Data in Healthcare

arXiv:2602.07033v1 Announce Type: new Abstract: The lack of real-world data in clinical fields poses a major obstacle in training effective AI models for diagnostic and preventive tools in medicine. Generative AI has shown promise in increasing data volume and enhancing model training, particularly...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

AVERE: Improving Audiovisual Emotion Reasoning with Preference Optimization

arXiv:2602.07054v1 Announce Type: new Abstract: Emotion understanding is essential for building socially intelligent agents. Although recent multimodal large language models have shown strong performance on this task, two key challenges remain - spurious associations between emotions and irrelevant...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

LLM-FSM: Scaling Large Language Models for Finite-State Reasoning in RTL Code Generation

arXiv:2602.07032v1 Announce Type: new Abstract: Finite-state reasoning, the ability to understand and implement state-dependent behavior, is central to hardware design. In this paper, we present LLM-FSM, a benchmark that evaluates how well large language models (LLMs) can recover finite-state machi...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

ST-Raptor: An Agentic System for Semi-Structured Table QA

arXiv:2602.07034v1 Announce Type: new Abstract: Semi-structured table question answering (QA) is a challenging task that requires (1) precise extraction of cell contents and positions and (2) accurate recovery of key implicit logical structures, hierarchical relationships, and semantic associations...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents

arXiv:2602.07035v1 Announce Type: new Abstract: Recently, Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Meanwhile, despite the rapid advancement of Search Agents, thei...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

Aster: Autonomous Scientific Discovery over 20x Faster Than Existing Methods

arXiv:2602.07040v1 Announce Type: new Abstract: We introduce Aster, an AI agent for autonomous scientific discovery capable of operating over 20 times faster than existing frameworks. Given a task, an initial program, and a script to evaluate the performance of the program, Aster iteratively improv...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

arXiv:2602.07055v1 Announce Type: new Abstract: Spatial embodied intelligence requires agents to act to acquire information under partial observability. While multimodal foundation models excel at passive perception, their capacity for active, self-directed exploration remains understudied. We prop...

#ArXiv#Machine Learning#Academic

Tool• Feb 10, 2026

Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization

Efficient large-scale inference of transformer-based large language models (LLMs) remains a fundamental systems challenge, frequently requiring multi-GPU parallelism to meet stringent latency and throughput targets. Conventional tensor parallelism decomposes matrix operations across devices but intr...

#Apple#On-device AI

Tool• Feb 9, 2026

Unpacking the craft of an applied machine learning product manager

Great ML models don’t guarantee great products. Applied ML PMs turn model performance into experiences users can trust, understand, and value.

#AI Accelerator Institute#AI#Research

Tool• Feb 9, 2026

NanoNet: Parameter-Efficient Learning with Label-Scarce Supervision for Lightweight Text Mining Model

arXiv:2602.06093v1 Announce Type: new Abstract: The lightweight semi-supervised learning (LSL) strategy provides an effective approach to conserving labeled samples and minimizing model inference costs. Prior research has effectively applied knowledge transfer learning and co-training regularizatio...

#ArXiv#Machine Learning#Academic

Tool• Feb 9, 2026

Agentic Workflow Using RBA$_\theta$ for Event Prediction

arXiv:2602.06097v1 Announce Type: new Abstract: Wind power ramp events are difficult to forecast due to strong variability, multi-scale dynamics, and site-specific meteorological effects. This paper proposes an event-first, frequency-aware forecasting paradigm that directly predicts ramp events and...

#ArXiv#Machine Learning#Academic

Tool• Feb 9, 2026

Toward Faithful and Complete Answer Construction from a Single Document

arXiv:2602.06103v1 Announce Type: new Abstract: Modern large language models (LLMs) are powerful generators driven by statistical next-token prediction. While effective at producing fluent text, this design biases models toward high-probability continuations rather than exhaustive and faithful answ...

#ArXiv#Machine Learning#Academic

Tool• Feb 9, 2026

Pragmatic Curiosity: A Hybrid Learning-Optimization Paradigm via Active Inference

arXiv:2602.06104v1 Announce Type: new Abstract: Many engineering and scientific workflows depend on expensive black-box evaluations, requiring decision-making that simultaneously improves performance and reduces uncertainty. Bayesian optimization (BO) and Bayesian experimental design (BED) offer po...

#ArXiv#Machine Learning#Academic
