TDS Newsletter: December Must-Reads on GraphRAG, Data Contracts, and More
Don't miss our most popular articles of the previous month
The post TDS Newsletter: December Must-Reads on GraphRAG, Data Contracts, and More appeared first on Towards Data Science.
Most RAG demos stop at “upload a PDF and ask a question.” That proves the pipeline works. It doesn’t prove you understand it. These projects are designed to break in interesting ways. They surface bias, contradictions, forgotten context, and overconfident answers. That’s where real RAG learning star...
How to Improve the Performance of Visual Anomaly Detection Models
Apply the best methods from academia to get the most out of practical applications
The post How to Improve the Performance of Visual Anomaly Detection Models appeared first on Towards Data Science.
HNSW at Scale: Why Your RAG System Gets Worse as the Vector Database Grows
How approximate vector search silently degrades recall, and what to do about it
The post HNSW at Scale: Why Your RAG System Gets Worse as the Vector Database Grows appeared first on Towards Data Science.
Win 2026! 9 AI Prompts to Enter Beast Mode This New Year
The beginning of a new year brings a new sense of energy to most people. One may argue that it is all psychological, as nothing changes other than the date. Agreed, to a point. Though it is psychological, the change is not based only on the onset of a “new year.” A deep-rooted logical reasoning […]
Why Supply Chain is the Best Domain for Data Scientists in 2026 (And How to Learn It)
My take after 10 years in Supply Chain on why this can be an excellent playground for data scientists who want to see their skills valued.
The post Why Supply Chain is the Best Domain for Data Scientists in 2026 (And How to Learn It) appeared first on Towards Data Science.
I get the AI scare, and to be honest, you should take it seriously too. The AI age is unfolding fast, and we are seeing automation enter just about every sector. Once it does, there is no question that the dynamics of human roles will change. So, for most of […]
Dummy Variable Trap in Machine Learning Explained Simply
In machine learning with categorical data, it is common to encode categories as numerical dummy variables (often called one-hot encoding). This is a significant step since many algorithms operate only on numbers, like lin...
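As a rough illustration of the trap named in the title, here is a minimal sketch in plain Python. The `one_hot` helper and the sample `colors` list are hypothetical and are not taken from the article. With all dummy columns kept, every row sums to 1, so the columns are perfectly collinear with a model's intercept; dropping one category as a baseline is a common remedy.

```python
def one_hot(values, drop_first=False):
    """Encode category labels as rows of 0/1 dummy columns (hypothetical helper)."""
    categories = sorted(set(values))
    if drop_first:
        # The dropped category becomes the implicit baseline (all zeros).
        categories = categories[1:]
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # hypothetical sample data

# Full encoding: one column per category.
full_cols, full_rows = one_hot(colors)
# Each row sums to 1, which duplicates the information in an intercept column.
assert all(sum(row) == 1 for row in full_rows)

# Dropping the first category removes that redundancy.
safe_cols, safe_rows = one_hot(colors, drop_first=True)

print(full_cols, full_rows)
print(safe_cols, safe_rows)
```

In the reduced encoding, a row of all zeros stands for the dropped category, so no information is lost.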
In this article, I look back at what I consider the ten most consequential, broadly impactful AI storylines of 2025, and gain insight into where the field is going in 2026.
Part 2: Avoiding burnout, learning strategies and the superpower of solitude
The post The Best Data Scientists Are Always Learning appeared first on Towards Data Science.
GliNER2: Extracting Structured Information from Text
From unstructured text to structured Knowledge Graphs
The post GliNER2: Extracting Structured Information from Text appeared first on Towards Data Science.
Data science powers decision-making across modern businesses, from data preparation and automation to advanced analytics and machine learning. Learning it requires a strong foundation in mathematics, statistics, programming, and practical problem-solving. The good news is that data science can be se...
AI-based coding agents are changing developer workflows. The proof: the arrival of Gemini 3 Pro in the Gemini CLI, a significant advancement. It provides advanced reasoning, enhanced tool usage, and natural-language coding right in the terminal. Developers will be able to generat...
LangChain vs LangGraph vs LangSmith vs LangFlow: Choosing the Right LLM Toolkit
The LangChain ecosystem provides an important set of tools for building applications with Large Language Models (LLMs). However, when tools such as LangChain, LangGraph, LangSmith, and LangFlow are mentioned, it is often difficult to know where to begin. This is a ...
YOLOv1 Loss Function Walkthrough: Regression for All
An explanation of how YOLOv1 measures the correctness of its object detection and classification predictions
The post YOLOv1 Loss Function Walkthrough: Regression for All appeared first on Towards Data Science.
How to Filter for Dates, Including or Excluding Future Dates, in Semantic Models
It is common to have either planning data or the previous year's data displayed beyond today's date. But future data can be confusing. How can I add a Slicer to show or hide future data? Let’s see how to do it.
The post How to Filter for Dates, Including or Excluding Future Dates, in Semantic Models...
How to Structure Your Data Science Project (With Frameworks & Best Practices)
Ever felt lost in messy folders, too many scripts, and unorganized code? That chaos only slows you down and makes the data science journey harder. Organized workflows and project structures are not just nice-to-have: they affect reproducibility, collaboration, and understanding of what’s happeni...