Stay ahead of the generative AI revolution!Join the M5B Newsletter →

AI Tools & Frameworks Directory

Discover essential tools, libraries, and frameworks to power your AI workflows.

All Engineering Hardware Jobs News Research Tools Tutorials

News AI TechCrunch Analytics Vidhya Data Science Towards Data Science Medium GenAI Textual OpenAI Google MIT Microsoft HuggingFace OpenSource Models NVIDIA GPU Enterprise ArXiv

Tool• Jun 9, 2026

OmniMem: Perturbation-aware Memory Compression for Streaming Audio-Visual LLMs

arXiv:2606.07577v1 Announce Type: new Abstract: Audio-visual large language models (LLMs) hold strong promise for long-form video understanding, yet their long-video inference is fundamentally limited by the linear growth of video tokens and key-value (KV) caches. We present OmniMem, a memory-effic...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

Syll: Open-Source Personal Automation with Cross-Surface Execution

arXiv:2606.07594v1 Announce Type: new Abstract: Personal AI agents must increasingly operate across APIs, shells, web surfaces, and desktop GUIs, yet many systems remain tuned to a single interface and offer limited support for user teaching and auditability. We present Syll, an open-source, self-h...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

arXiv:2606.07549v1 Announce Type: new Abstract: Recent advances in Multimodal Large Language Models (MLLMs) and agent workflows have shown strong promise for computational pathology, yet reliable patch-level reasoning remains challenging. End-to-end pathology MLLMs often hallucinate morphological f...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

MedicalRec: Medical recommender system for image classification without retraining

arXiv:2606.07553v1 Announce Type: new Abstract: The emergence of machine learning and deep learning has revolutionized the efficiency of diagnostic, therapeutic, and administrative systems in healthcare. However, this rapid adoption has come at the cost of requiring significant computing power and ...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

SPIN: Decentralized Swarm Control via Tensorized Policy Coordination

arXiv:2606.07557v1 Announce Type: new Abstract: Decentralized multi-agent swarm coordination on resource-constrained edge platforms remains fundamentally bottlenecked by the exponential scaling of joint action spaces and high-latency communication overhead. This paper introduces the Swarm Policy In...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

Emergence via Phase Transitions: Mechanism Landscapes and Universal Convergence Across Complex Systems

arXiv:2606.07563v1 Announce Type: new Abstract: Across machine learning, biology, and physics, independently evolving systems often converge toward strikingly similar high-level structures despite radically different microscopic details. Grokking circuits converge across random seeds, evolutionary ...

#ArXiv#Machine Learning#Academic

Tool• Jun 9, 2026

Boundary Variance Inflation Causes Acquisition Bias in Gaussian Processes

arXiv:2606.07561v1 Announce Type: new Abstract: Gaussian processes with stationary kernels on bounded domains exhibit inflated posterior variance near the boundary. Despite being a long-recognized artifact in geostatistics and a source of over-exploration in Bayesian optimization, the causes and ef...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

DiBS: Diffusion-Informed Branch Selection

arXiv:2606.06518v1 Announce Type: new Abstract: Sudoku is a representative constraint satisfaction problem that requires global structural reasoning under strict discrete constraints. The existing works of solving Sudoku mainly focus on two dominant approaches, i.e., traditional heuristic and deep ...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

arXiv:2606.06523v1 Announce Type: new Abstract: Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge in artificial intelligence. Despite recent advances in LLMs' agentic capabilities, most agent systems still lack formal methods for specifyi...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions

arXiv:2606.06526v1 Announce Type: new Abstract: Large language models have made substantial progress on mathematical reasoning, but existing benchmarks typically evaluate well-specified problems with final answers, step-by-step solutions, or complete proofs. They do not capture collaborative open-p...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

SafeGene: Reusable Adapters for Transferable Safety Alignment

arXiv:2606.06519v1 Announce Type: new Abstract: Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make models more vulnerable to malicious prompts, even when the training data is not intentionally harmful. This create...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

Elmes*: Automated Construction of Fine-Grained Evaluation Rubrics for Large Language Models in Long-Tail Educational Scenarios

arXiv:2606.06546v1 Announce Type: new Abstract: Evaluating large language models (LLMs) for education requires measuring how models teach, not only what they know. Existing benchmarks emphasize domain-general correctness or depend on manually designed rubrics that scale poorly to long-tail pedagogi...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

arXiv:2606.06547v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) refine tokens iteratively but commit them irreversibly, leading to a "stability lag" where early decisions remain fragile even after being written. We reveal that Post-Training Quantization (PTQ) error easily fl...

#ArXiv#Machine Learning#Academic

Tool• Jun 8, 2026

MacArena: Benchmarking Computer Use Agents on an Online macOS Environment

arXiv:2606.06560v1 Announce Type: new Abstract: Computer-use agents (CUAs) operate graphical user interfaces (GUIs) through vision and control primitives, and their capabilities have advanced rapidly, driven in part by standardized online evaluation benchmarks such as OSWorld, which serve both as e...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

I Know What You Meme, Even If it Emerged Today: Understanding Evolving Memes through Open-World Knowledge Acquisition

arXiv:2606.05316v1 Announce Type: new Abstract: Multimodal memes are dynamic and often require up to date background knowledge for interpretation. Existing methods often overlook such knowledge or rely on fixed parametric knowledge of pretrained models that may be incomplete, outdated, or unavailab...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

GITCO: Gated Inference-Time Context Optimization in TSFMs

arXiv:2606.05332v1 Announce Type: new Abstract: Patch-based Time Series Foundation Models (TSFMs) suffer from context poisoning: structurally anomalous patches capture disproportionate attention and silently degrade zero-shot forecast quality. We propose improving TSFM accuracy at inference time by...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Uncertainty Aware Functional Behavior Prediction and Material Fatigue Assessment for Circular Factory

arXiv:2606.05334v1 Announce Type: new Abstract: Returned products in circular factories re-enter production with heterogeneous degradation states, usage histories, and remaining capability. Reuse cannot be decided from the current inspection alone, because future function fulfillment and component ...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

How Far Did They Go? The Persuasive Tactics of Covert LLM Agents in a Discontinued Field Experiment

arXiv:2606.05256v1 Announce Type: new Abstract: This study analyzes a publicly released dataset from a discontinued field experiment on Reddit's r/ChangeMyView. The intervention, conducted by unknown, external researchers and halted following ethical backlash, involved undisclosed AI-generated acco...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems

arXiv:2606.05304v1 Announce Type: new Abstract: Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. However, this free-form co...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Staged Factorial Screening for Budget-Constrained Micro-Pretraining

arXiv:2606.05186v1 Announce Type: new Abstract: Budget-constrained micro-pretraining often requires triaging many candidate recipes on a shared accelerator before larger search budgets are spent. We study whether a staged fractional-factorial workflow can recover stable early effect structure in th...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

ERRORQUAKE: Heavy-Tailed Error Severity Distributions in Open-Weight Large Language Models

arXiv:2606.05170v1 Announce Type: new Abstract: At matched accuracy, open-weight LLMs differ substantially in the shape of their error severity distribution -- a difference invisible to the scalar error rate. Hallucination benchmarks report a single error count and treat all errors as equivalent, y...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

Temporal Preference Concepts and their Functions in a Large Language Model

arXiv:2606.05194v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly being deployed to make decisions that require trading off near-term gains against long-term consequences, yet little is known about how they internally represent or resolve these tradeoffs. In this work, w...

#ArXiv#Machine Learning#Academic

Tool• Jun 5, 2026

The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models

arXiv:2606.05169v1 Announce Type: new Abstract: We give a stereological theory of LLM benchmark coverage. For any suite with effective dimensionality d_eff, the visible Hausdorff distance between two convex capability profiles consistent with the same scores is bounded by epsilon + C R m^(-1/(d_eff...

#ArXiv#Machine Learning#Academic

Tool• Jun 4, 2026

Thinking Through Signs: PEEL as a Semiotic Scaffolding for Epistemically Accountable AI-Enabled Research

arXiv:2606.04152v1 Announce Type: new Abstract: Large language models are reshaping research practice while quietly eroding researchers epistemic accountability. This commentary introduces PEEL - Protocols for Epistemically Engaged Literacy in AI, a working scaffolding that combines deterministic d...

#ArXiv#Machine Learning#Academic

1...9 10 11 12 13...48