Deep Reinforcement Learning: The Actor-Critic Method
Robot friends collaborate to learn to fly a drone
The post Deep Reinforcement Learning: The Actor-Critic Method appeared first on Towards Data Science.
Google T5Gemma-2 Explained: Trying Out a Laptop-Friendly Multimodal AI Model
Google just dropped T5Gemma-2, and it is a game-changer for anyone working with AI models on everyday hardware. Built on the Gemma 3 family, this encoder-decoder powerhouse squeezes multimodal smarts and massive context into tiny packages. Imagine 270M parameters running smoothly on your la...
Train Your Large Model on Multiple GPUs with Tensor Parallelism
This article is divided into five parts; they are:
• An Example of Tensor Parallelism
• Setting Up Tensor Parallelism
• Preparing Model for Tensor Parallelism
• Train a Model with Tensor Parallelism
• Combining Tensor Parallelism with FSDP
Tensor parallelism originated from the Megatron-LM paper.
Production-Ready LLMs Made Simple with the NeMo Agent Toolkit
From simple chat to multi-agent reasoning and real-time REST APIs
The post Production-Ready LLMs Made Simple with the NeMo Agent Toolkit appeared first on Towards Data Science.
Student ID Benefits Worth Thousands: Get 15+ Premium Tools For Free or on Discount
I remember from my student days the plethora of subscriptions, fees, and payments to be made for a range of tasks. Be it learning a new skill, using the right environment for practice, or simply travelling to and from home, we had to shell out money from our own pockets. But it is almost 2026 now, […]
Train Your Large Model on Multiple GPUs with Fully Sharded Data Parallelism
This article is divided into five parts; they are:
• Introduction to Fully Sharded Data Parallel
• Preparing Model for FSDP Training
• Training Loop with FSDP
• Fine-Tuning FSDP Behavior
• Checkpointing FSDP Models
Sharding is a term originally used in database management systems, where it refers to...
The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel
AUC measures how well a model ranks positives above negatives, independent of any chosen threshold.
The post The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel appeared first on Towards Data Science.
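The ranking interpretation of AUC stated above can be illustrated with a short Python sketch. It is not taken from the article: the function name `auc_by_ranking` and the toy labels and scores are assumptions made for illustration. It counts, over every (positive, negative) pair, how often the positive receives the higher score, with ties counted as half.

```python
def auc_by_ranking(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties count as half a win. No classification threshold is involved.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives, 2 negatives.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = auc_by_ranking(labels, scores)

# Rescaling the scores keeps their order, so the AUC is unchanged.
auc_rescaled = auc_by_ranking(labels, [10 * s for s in scores])
```

Because only the ordering of the scores matters, any order-preserving rescaling leaves the result the same, which is what "independent of any chosen threshold" amounts to in practice.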
As a developer, tell me if you relate to this – Docker commands are easy to understand but difficult to apply meaningfully. Out of the countless tutorials that I followed, most stopped at syntax, leaving me unsure about what to build next. (Here is an exception – A step-by-step Docker tutorial for c...
Train Your Large Model on Multiple GPUs with Pipeline Parallelism
This article is divided into six parts; they are:
• Pipeline Parallelism Overview
• Model Preparation for Pipeline Parallelism
• Stage and Pipeline Schedule
• Training Loop
• Distributed Checkpointing
• Limitations of Pipeline Parallelism
Pipeline parallelism means creating the model as a pipeline o...
Implementing Vibe Proving with Reinforcement Learning
How to make LLMs reason with verifiable, step-by-step logic (Part 2)
The post Implementing Vibe Proving with Reinforcement Learning appeared first on Towards Data Science.
Google A2UI Explained: How AI Agents Build Secure, Native User Interfaces
We have entered the era of multi-agent artificial intelligence. However, one important question remains: how can remote AI agents produce rich, interactive experiences without exposing the system to security risks? Google's A2UI (Agent-to-UI) protocol addresses this question in a very sma...
Exploring TabPFN: A Foundation Model Built for Tabular Data
Understanding the architecture, training pipeline and implementing TabPFN in practice
The post Exploring TabPFN: A Foundation Model Built for Tabular Data appeared first on Towards Data Science.
How IntelliNode Automates Complex Workflows with Vibe Agents
Many AI systems focus on isolated tasks or simple prompt engineering. This approach allowed us to build interesting applications from a single prompt, but we are starting to hit a limit. Simple prompting falls short when we tackle complex AI tasks that require multiple stages or enterprise systems t...
Agent creation has become easier than ever, but have you ever wondered how we can make agents more powerful than they already are? I recently thought of one possible way: what if they had real-time information about specific categories like finance and movies? That would be really cool, right? While e...
Build Your Own NotebookLlama: A PDF to Podcast Pipeline (Open, Fast, and Fully Yours)
NotebookLM is a relatively new Internet phenomenon with which Google has distinguished itself, thanks to its Audio Overview mode, a mechanism that transforms the text of a paper into a two-person podcast, all in a single click. But what should you do when you wish to build it yourself...
Training a Model on Multiple GPUs with Data Parallelism
This article is divided into two parts; they are:
• Data Parallelism
• Distributed Data Parallelism
If you have multiple GPUs, you can combine them to operate as a single GPU with greater memory capacity.
Keeping Probabilities Honest: The Jacobian Adjustment
An intuitive explanation of transforming random variables correctly.
The post Keeping Probabilities Honest: The Jacobian Adjustment appeared first on Towards Data Science.
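As a rough illustration of what transforming a random variable correctly involves, the sketch below works through one standard case in Python. The example is an assumption of this digest, not taken from the article: X is uniform on (0, 1) and Y = X². The change-of-variables rule gives p_Y(y) = p_X(√y) · |d√y/dy| = 1 / (2√y), and the code compares that density against simulated samples and against the naive density that omits the Jacobian factor.

```python
import math
import random


def density_y(y):
    """Density of Y = X**2 for X ~ Uniform(0, 1), including the Jacobian factor."""
    if not 0.0 < y <= 1.0:
        return 0.0
    p_x = 1.0                               # Uniform(0, 1) density at sqrt(y)
    jacobian = 1.0 / (2.0 * math.sqrt(y))   # |d sqrt(y) / dy|
    return p_x * jacobian


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


# Empirical P(Y <= 0.25) from simulated samples (fixed seed for repeatability).
rng = random.Random(0)
samples = [rng.random() ** 2 for _ in range(100_000)]
empirical = sum(s <= 0.25 for s in samples) / len(samples)

# The same probability from the adjusted density, and from the naive density
# that reuses p_X unchanged and forgets the Jacobian.
with_jacobian = integrate(density_y, 0.0, 0.25)
without_jacobian = integrate(lambda y: 1.0, 0.0, 0.25)
```

Here `empirical` and `with_jacobian` both come out close to 0.5 (the exact value is √0.25), while `without_jacobian` gives 0.25, which shows how the probabilities stop being "honest" when the adjustment is skipped.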
The Machine Learning “Advent Calendar” Day 24: Transformers for Text in Excel
An intuitive, step-by-step look at how Transformers use self-attention to turn static word embeddings into contextual representations, illustrated with simple examples and an Excel-friendly walkthrough.
The post The Machine Learning “Advent Calendar” Day 24: Transformers for Text in Excel appeared f...
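The idea of turning static embeddings into contextual ones can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the article's walkthrough: the two-dimensional embeddings are made up, and the queries, keys, and values are the embeddings themselves (identity projections), whereas real Transformers use learned projection matrices.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings (assumption)."""
    d = len(embeddings[0])
    outputs = []
    for q in embeddings:
        # Similarity of this token to every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in embeddings]
        weights = softmax(scores)
        # Contextual vector: attention-weighted average of all token vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, embeddings)) for j in range(d)]
        )
    return outputs


# Hypothetical static embeddings: "bank" has the same vector in both sentences.
river, money, bank = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]

bank_near_river = self_attention([river, bank])[1]
bank_near_money = self_attention([money, bank])[1]
```

Although "bank" starts from the same static vector in both sequences, its output leans toward the "river" direction in the first and toward the "money" direction in the second, which is the sense in which the representation becomes contextual.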
Training a Model with Limited Memory using Mixed Precision and Gradient Checkpointing
This article is divided into three parts; they are:
• Floating-point Numbers
• Automatic Mixed Precision Training
• Gradient Checkpointing
Let's get started! The default data type in PyTorch is the IEEE 754 32-bit floating-point format, also known as single precision.
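To make the 32-bit format concrete, here is a minimal sketch that uses only Python's standard `struct` module rather than PyTorch. The helper names `to_float32` and `float32_bits` are hypothetical. It rounds a Python float (64-bit) to single precision and splits the stored value into the sign, exponent, and fraction fields of IEEE 754.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest IEEE 754 32-bit value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_bits(x):
    """Return the (sign, exponent, fraction) bit fields of x stored as float32."""
    (raw,) = struct.unpack("<I", struct.pack("<f", x))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = raw & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction


# 0.1 is not exactly representable, and single precision is coarser than double.
rounded = to_float32(0.1)
rounding_error = abs(rounded - 0.1)

# 1.0 is stored with a biased exponent of 127 and an all-zero fraction.
sign, exponent, fraction = float32_bits(1.0)

# With a 24-bit significand, 2**24 + 1 cannot be stored and rounds to 2**24.
largest_exact_plus_one = to_float32(16777217.0)
```

The limited 24-bit significand is why lower-precision formats save memory at the cost of accuracy, which is the trade-off mixed precision training has to manage.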