Text-to-SQL will replace Data Analysts
The Data Path #12

Hi everyone 👋, welcome to a new edition of The Data Path!

For years, we’ve dreamed of a world where anyone, not just data analysts, could ask a question in plain English and get back a SQL query or even a full dashboard.

That’s the promise behind Text-to-SQL tools. Whichever tool you pick, the message is always the same:

Forget SQL! Just talk to your data.

It sounds revolutionary. But in practice? It’s not that simple. Grab a coffee ☕ or your favorite drink and enjoy the read!


The fundamentals behind text-to-SQL

Figure: End-to-end text-to-SQL pipeline

Text-to-SQL is the process of turning human language into structured logic.

When a user types a question like “Show me revenue by country for the last quarter”, several things happen behind the scenes:

  1. UNDERSTANDING: The model first interprets what the user is asking: “revenue” → metric, “country” → dimension, “last quarter” → time filter.
  2. SCHEMA MAPPING: The system maps those entities to objects in the database: which table contains “revenue”? How is “country” stored? This is where metadata and schema context become critical.
  3. SQL GENERATION: The model constructs a query that logically retrieves the requested data.
  4. VALIDATION AND EXECUTION: The SQL is validated and executed, and the results are returned to the user as a table or a chart.
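
To make the four steps concrete, here is a minimal sketch in Python with SQLite. It is only an illustration: the keyword lookups stand in for the LLM, and the orders table, its columns, and the hard-coded dates for “last quarter” are all assumptions made up for this example.

```python
import sqlite3

# Hypothetical lookup tables standing in for the LLM and the metadata catalog.
METRICS = {"revenue": "SUM(amount)"}
DIMENSIONS = {"country": "country"}
# Assumption: "last quarter" is hard-coded here; a real system would compute it.
TIME_FILTERS = {
    "last quarter": "order_date >= '2025-07-01' AND order_date < '2025-10-01'"
}


def understand(question):
    """Step 1: pick out metric, dimension and time filter by keyword match."""
    q = question.lower()

    def find(vocab):
        return next((term for term in vocab if term in q), None)

    return {
        "metric": find(METRICS),
        "dimension": find(DIMENSIONS),
        "time": find(TIME_FILTERS),
    }


def generate_sql(intent):
    """Steps 2 and 3: map the entities to schema objects and build the query."""
    dim = DIMENSIONS[intent["dimension"]]
    return (
        f"SELECT {dim}, {METRICS[intent['metric']]} AS revenue "
        f"FROM orders WHERE {TIME_FILTERS[intent['time']]} "
        f"GROUP BY {dim} ORDER BY {dim}"
    )


def run(conn, sql):
    """Step 4: let SQLite compile the query via EXPLAIN, then execute it."""
    conn.execute("EXPLAIN " + sql)  # raises sqlite3.OperationalError if invalid
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount REAL, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("ES", 100.0, "2025-08-15"),
    ("ES", 50.0, "2025-09-01"),
    ("US", 200.0, "2025-07-20"),
    ("US", 999.0, "2025-03-01"),  # outside the quarter, must be filtered out
])

intent = understand("Show me revenue by country for the last quarter")
sql = generate_sql(intent)
rows = run(conn, sql)
print(rows)  # [('ES', 150.0), ('US', 200.0)]
```

Notice that the validation step only checks that the query compiles. It says nothing about whether “revenue” really means SUM(amount) in your company.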

It’s a fusion of linguistics, metadata engineering, and database logic, which explains why it’s so powerful.


The Promise

On paper, text-to-SQL is very promising. Marketing managers may know nothing about databases, SQL, or coding in general, but they still want answers to questions like:

What were our top-selling products in the US last quarter?🤔

And the AI instantly generates a working SQL query, executes it, and visualizes the results.

No analysts. No dashboards. No waiting.

BOOM! The whole Data Department is no longer needed! Managers are happy: fewer paychecks and better results. More profit! 💰

In theory, Text-to-SQL could eliminate the bottleneck between business questions and data insights. But in reality, what these systems generate is often syntactically correct yet semantically wrong.
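
Here is a small sketch of what “syntactically correct yet semantically wrong” can look like. The sales table, its status column, and the rule that refunded sales do not count are assumptions invented for the example, not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount REAL, status TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Coffee", 500.0, "completed"),
    ("Coffee", 400.0, "refunded"),
    ("Tea", 600.0, "completed"),
])

# What a model could plausibly generate: valid SQL that runs without error.
naive = (
    "SELECT product FROM sales "
    "GROUP BY product ORDER BY SUM(amount) DESC LIMIT 1"
)

# What the business means (assumed rule): refunded sales do not count.
intended = (
    "SELECT product FROM sales WHERE status = 'completed' "
    "GROUP BY product ORDER BY SUM(amount) DESC LIMIT 1"
)

naive_top = conn.execute(naive).fetchone()[0]
intended_top = conn.execute(intended).fetchone()[0]
print(naive_top, intended_top)  # Coffee Tea
```

Both queries run. Only one answers the question the manager actually asked, and nothing in the schema tells the model which one.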


The Real Problem Isn’t SQL! It’s Context

SQL requires contextual understanding of the underlying data model.

That’s exactly where every Text-to-SQL system breaks.

Even the most advanced LLMs can’t know what “active customer” or “net revenue” means inside your company unless you explicitly tell them. You can feed the model every table name, column, and data type, but metadata isn’t meaning.

To generate a correct query, an AI needs to understand:

  • The relationships between tables.
  • The business logic behind calculated fields.
  • Which filters apply in which contexts.
  • What data each user is allowed to access.

That’s not something you can simply “prompt in.” This is why the term Data Governance is everywhere nowadays.
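
One way to picture that governance work is as a glossary and a set of access rules that humans write down and maintain. The sketch below is purely illustrative: the definitions, roles, and table names are hypothetical, and the matching is a naive substring check.

```python
# Hypothetical business glossary: the rules are illustrative, not real ones.
GLOSSARY = {
    "active customer": "customer with at least one completed order in the last 90 days",
    "net revenue": "SUM(amount) of completed orders minus refunds and discounts",
}

# Hypothetical access rules: which tables each role may query.
ACCESS = {
    "marketing": {"orders", "products"},
    "finance": {"orders", "products", "invoices"},
}


def build_context(question, role):
    """Collect the definitions and allowed tables relevant to one question."""
    q = question.lower()
    definitions = {term: rule for term, rule in GLOSSARY.items() if term in q}
    return {
        "definitions": definitions,
        "allowed_tables": sorted(ACCESS.get(role, set())),
    }


ctx = build_context("How many active customers did we have in Spain?", "marketing")
print(ctx["definitions"])
print(ctx["allowed_tables"])  # ['orders', 'products']
```

The code is the easy part. Agreeing on what goes into the glossary, and keeping it up to date, is the hard part, and that work belongs to people.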

Figure: What is Data Governance

The Context Problem

Recent advances like the Model Context Protocol (MCP) aim to close this gap.

They allow LLMs to connect to your metadata catalog and automatically retrieve table schemas, lineage, and column descriptions. It’s a huge step forward.

But even with MCP, the model doesn’t “understand” your data the way an analyst does. It just has more structured context. The meaning, the why, still has to come from humans.
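
To see what “more structured context” looks like, here is a rough sketch that reads a table’s schema from SQLite and turns it into a one-line description for a prompt. This is not MCP itself, just plain Python standing in for the kind of metadata such a connection could supply. The customers table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT, status TEXT)"
)


def describe_table(conn, table):
    """Return column names and types as a one-line schema description."""
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated directly, so pass trusted names only.
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return f"{table}(" + ", ".join(f"{c[1]} {c[2]}" for c in cols) + ")"


schema_context = describe_table(conn, "customers")
print(schema_context)  # customers(id INTEGER, country TEXT, status TEXT)
```

The model now knows that a status column exists. It still doesn’t know which value of status your company considers an “active customer”.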

Figure: MCP integration. Source: www.descope.com

Guardrails Are the Real Game-Changer

A more pragmatic approach comes from using a guard-railed architecture:

  • Define a set of approved query templates.
  • Let the AI fill only the parameters (date ranges, categories, filters).
  • Validate everything before execution.
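
A minimal sketch of that architecture, assuming a single approved template and an AI that returns only a template name plus parameters. The template, table, and function names are hypothetical.

```python
import sqlite3
from datetime import date

# Approved templates, written and reviewed by analysts (hypothetical example).
TEMPLATES = {
    "revenue_by_country": (
        "SELECT country, SUM(amount) FROM orders "
        "WHERE order_date >= :start AND order_date < :end "
        "GROUP BY country ORDER BY country"
    ),
}


def validate(template_id, params):
    """Reject unknown templates and malformed parameters before execution."""
    if template_id not in TEMPLATES:
        raise ValueError(f"template not approved: {template_id}")
    start = date.fromisoformat(params["start"])  # raises ValueError on bad input
    end = date.fromisoformat(params["end"])
    if start >= end:
        raise ValueError("start must be before end")
    return {"start": start.isoformat(), "end": end.isoformat()}


def run_guarded(conn, template_id, params):
    """Validate first, then run the approved SQL with bound parameters."""
    clean = validate(template_id, params)
    # Values are bound as parameters, never concatenated into the SQL text.
    return conn.execute(TEMPLATES[template_id], clean).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount REAL, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("ES", 100.0, "2025-08-15"),
    ("US", 200.0, "2025-07-20"),
])

# Pretend this dict is what the model returned for a user question.
ai_output = {
    "template": "revenue_by_country",
    "params": {"start": "2025-07-01", "end": "2025-10-01"},
}
rows = run_guarded(conn, ai_output["template"], ai_output["params"])
print(rows)  # [('ES', 100.0), ('US', 200.0)]
```

The design choice here is that the model never writes SQL. It only picks from what an analyst has already approved, so a bad answer becomes a rejected request instead of a wrong number on a slide.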

This hybrid model keeps flexibility while protecting data integrity. It doesn’t replace the analyst; it extends their capabilities.

The AI handles repetitive querying, while the analyst interprets, validates, and communicates insights.


Will Text-to-SQL Replace Data Analysts?

Not yet... and maybe not ever!

LLMs can generate queries. They can’t generate understanding.

A Text-to-SQL model doesn’t know when data looks wrong, or when a trend doesn’t make sense. It doesn’t ask follow-up questions or challenge assumptions.

The real value of a data analyst isn’t typing SQL; it’s connecting business questions to reliable, contextual answers.

What will happen is a shift:

  • Analysts who rely only on manual querying will fall behind.
  • Those who use AI to move faster, explore more broadly, and validate smarter will thrive!


Text-to-SQL won’t replace data analysts, but analysts who use Text-to-SQL might replace those who don’t.

Long live SQL and Data Analysts! 👑

If you enjoyed this read, please give it a like so more people can discover it!

Don't forget to subscribe to The Data Path so you don't miss the latest trends in Data!

Best regards,

José Siles, Data Engineer at Nestlé
