From SQL Queries to Conversational Analytics: My Journey with LLMs, RAG, and Data Agents


As data professionals, we spend countless hours writing SQL queries, debugging joins, and preparing dashboards. But what if we could change the way we interact with data entirely?

Recently, I built an LLM-powered agent that connects directly to our SQL databases (SQL Server & ClickHouse) and uses Retrieval-Augmented Generation (RAG) on top of schema documentation. The goal was simple: allow anyone in the company to ask questions in plain language — and get answers, visualizations, and insights without writing SQL.


The Challenges We Face

  • Business users depend on analysts for even simple questions.
  • Analysts lose valuable time writing repetitive queries and producing ad-hoc reports.
  • Data access and security remain a constant concern.
  • Onboarding new colleagues into complex data models is not easy.


My Approach

I built an LLM Agent connected to our SQL Server and ClickHouse databases, with RAG on top of markdown documentation of our schema.

Key features:

  • Conversational Queries: Anyone can ask a question, no SQL knowledge required.
  • Smart Visualizations: Results are automatically visualized with plots, charts, or tables.
  • Error Learning: The system remembers failed queries and avoids repeating mistakes.
  • User Feedback: People can add feedback (e.g. “this result is incomplete”), and the agent learns from it for future queries.
  • Security: Since the LLM runs locally, data never leaves our environment.
  • Access Control: Different teams get different schema files in RAG, so everyone only sees what they should (a sketch of this routing follows this list).
  • Shared Understanding: All metrics and definitions are aligned through the schema documentation, reducing inconsistencies across teams.
  • Onboarding: New colleagues can “chat” with the data to learn faster.
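
A minimal sketch of that per-team routing, assuming one markdown schema file per team (the mapping, file paths, and schema_for helper are illustrative, not the actual implementation; load_schema appears in step 1 below):

# Hypothetical mapping from team to the schema docs it may see
TEAM_SCHEMAS = {
    "finance": "schemas/finance_tables.md",
    "marketing": "schemas/marketing_tables.md",
}

def schema_for(team):
    # The vector DB is built only from this team's file, so retrieval
    # can never surface tables outside the team's scope
    return load_schema(TEAM_SCHEMAS[team])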


The Solution

I combined LLMs, RAG, and SQL execution into a single workflow:

  • Schema files in Markdown describe our tables, KPIs, and definitions.
  • These files are embedded into a vector database (Chroma + SentenceTransformer).
  • An agent retrieves relevant schema context and generates SQL queries.
  • Queries are executed directly against SQL Server or ClickHouse, with auto-fix logic when errors occur.
  • Results are visualized automatically with Plotly.
  • Users can provide feedback (\Feedback …) and the system learns from it for future queries.


How It Works (Code Snippets)

1. Schema + RAG: Building the Vector DB

I start by loading the schema docs and building a vector index:

import chromadb
from sentence_transformers import SentenceTransformer

def load_schema(file_path):
    with open(file_path, "r", encoding="utf-8") as f:
        return f.read()

def build_vector_db(schema_text):
    # Split the markdown docs into one chunk per table section
    sections = schema_text.split("## Table:")
    model = SentenceTransformer("all-MiniLM-L6-v2")

    client = chromadb.Client()
    # Rebuild from scratch: drop any stale collection first
    try:
        client.delete_collection("schema")
    except Exception:
        pass
    collection = client.get_or_create_collection("schema")

    # Embed each section and store it with a stable id
    for i, sec in enumerate(sections):
        sec = sec.strip()
        if sec:
            emb = model.encode(sec).tolist()
            collection.add(documents=[sec], embeddings=[emb], ids=[f"sec_{i}"])
    return collection, model
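
At question time, the agent embeds the user's question and retrieves only the most relevant schema sections. The post doesn't show that step explicitly, so here is a minimal sketch assuming the collection and model returned by build_vector_db (retrieve_schema_context and the value of k are illustrative):

def retrieve_schema_context(question, collection, model, k=3):
    # Embed the question and pull the k most similar schema sections
    emb = model.encode(question).tolist()
    results = collection.query(query_embeddings=[emb], n_results=k)
    return "\n\n".join(results["documents"][0])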

2. SQL Generation + Auto-Fix

When a query fails, the system retries with error context:

# The agent attempts to fix the SQL query on failure
for attempt in range(1, max_retries + 1):
    try:
        rows, cols = run_sql(sql, db_config)
        return sql, rows, cols
    except Exception as e:
        last_error = str(e)
        log_sql_error(sql, last_error)
        # Ask the LLM for a corrected query, passing the error text
        corrected = fix_sql_with_error(
            ...
            error_text=last_error,
            ...
        )
        if not corrected:
            continue  # no correction suggested; retry unchanged
        sql = corrected

# All retries exhausted: surface the last error
raise RuntimeError(f"SQL failed after {max_retries} attempts: {last_error}")
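
run_sql and fix_sql_with_error are referenced above but not shown in the post. Here is a minimal sketch of both, assuming pyodbc for SQL Server access and the ask_llm helper from step 4; the parameter names, the db_config key, and the prompt wording are assumptions rather than the actual code:

import pyodbc

def run_sql(sql, db_config):
    # Execute the query and return (rows, column_names);
    # db_config["connection_string"] is a hypothetical key
    conn = pyodbc.connect(db_config["connection_string"])
    try:
        cur = conn.cursor()
        cur.execute(sql)
        cols = [c[0] for c in cur.description]
        rows = [tuple(r) for r in cur.fetchall()]
        return rows, cols
    finally:
        conn.close()

def fix_sql_with_error(question, schema_context, failed_sql, error_text):
    # Ask the LLM to repair the failing query, given the error message
    prompt = (
        "This SQL Server query failed.\n"
        f"Question: {question}\n"
        f"Schema context:\n{schema_context}\n"
        f"Failed SQL:\n{failed_sql}\n"
        f"Error:\n{error_text}\n"
        "Return only the corrected SQL query, nothing else."
    )
    corrected = ask_llm(prompt)
    return corrected.strip() or None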

3. User Feedback

Users can guide future answers with feedback:

# Intercept \Feedback commands before they reach the SQL agent
if q.startswith("\\Feedback"):
    fb_text = q.replace("\\Feedback", "").strip()
    log_feedback(fb_text)
    print("Feedback saved:", fb_text)
        

4. LLM Agent

Built on a local LLM served via Ollama, with self-correcting logic, the agent translates natural-language questions into valid SQL Server queries:

import requests

# URL for the local Ollama instance
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_llm(prompt, model="gpt-oss:20b"):
    """
    Sends a prompt to the specified LLM via Ollama and returns the response.
    Includes previous user feedback in the prompt to improve context.
    """
    # Add previous feedback to all prompts
    prompt += get_feedback_context()
    payload = {"model": model, "prompt": prompt, "stream": False}
    # Generous timeout: local generation on a 20B model can be slow
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"].strip()
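
Putting ask_llm together with the retrieval step from step 1, a question can be turned into SQL roughly like this (generate_sql and the exact prompt wording are illustrative assumptions, not the author's actual prompt):

def generate_sql(question, collection, model):
    # Ground the LLM in the retrieved schema context
    context = retrieve_schema_context(question, collection, model)
    prompt = (
        "You are a SQL Server expert. Using only this schema:\n"
        f"{context}\n\n"
        f"Write one valid T-SQL query that answers: {question}\n"
        "Return only the SQL, with no explanation."
    )
    return ask_llm(prompt)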
        

5. Visualization

Every result is automatically plotted:

import pandas as pd
import plotly.express as px

def visualize_data(rows, cols, chart_type):
    if not rows or not cols:
        print("No data to visualize")
        return
    df = pd.DataFrame.from_records(rows, columns=cols)

    fig = None
    if chart_type == "line":
        fig = px.line(df, x=cols[0], y=cols[1:], markers=True)
    elif chart_type == "bar":
        if len(cols) >= 2:
            fig = px.bar(df, x=cols[0], y=cols[1])
    elif chart_type == "scatter":
        if len(cols) >= 2:
            fig = px.scatter(df, x=cols[0], y=cols[1])
    elif chart_type == "pie":
        if len(cols) >= 2:
            fig = px.pie(df, names=cols[0], values=cols[1])
    else:
        # No explicit chart type: infer one from the column shapes
        if any(("date" in c.lower()) or ("time" in c.lower()) for c in cols):
            fig = px.line(df, x=cols[0], y=cols[1:], markers=True)
        elif len(cols) == 2:
            fig = px.bar(df, x=cols[0], y=cols[1])
        elif len(cols) >= 3:
            fig = px.scatter(df, x=cols[0], y=cols[1], size=cols[2], color=cols[1])

    if fig:
        fig.show()
    else:
        print("Could not generate visualization")

Why This Matters

  • For business users: instant answers without waiting for the data team.
  • For data teams: less time on repetitive SQL, more time on complex analysis and strategy.
  • For managers: faster decision-making with reliable, consistent data.
  • For the company: security (local LLM, no data leaves our environment) and access control (different schema files for different teams).


The Bigger Picture

This project is not just about speeding up reports. It’s about reshaping how companies interact with their data — making insights accessible, secure, and aligned across teams.

It also opens the door to:

  • Easier onboarding for new colleagues (they can “chat” with the data).
  • A shared source of truth through schema documentation.
  • Continuous improvement via error learning and user feedback.


Closing Thoughts

This journey taught me that AI is not here to replace analysts — it’s here to empower everyone to ask better questions and focus on higher-value work.

Have you experimented with LLMs for data analysis in your company? What opportunities or challenges do you see in combining AI with BI?


#Python #LLM #Ollama #SQL #DataScience #AI #GenerativeAI #AutoFix #ChromaDB
