Data Agent
If I could learn only one new term this year, I would choose “Data Agent”. It is that important.
In the near future, all of us in data warehousing and BI will need to use "chat with your data" to query databases and documents. Most of us will be using someone else's software, most probably from our data platform or BI platform.
Some of us will be lucky enough to need to build our own "chat with your data". The core technology is "text to SQL". Basically it's a SQL builder. The inputs are the user prompt, a few databases, and a semantic model. It is very difficult to get right, so you need to put the brightest minds on this task.
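To make the three inputs concrete, here is a minimal sketch of how a user prompt, the database tables and a semantic model might be assembled into the text that a text-to-SQL model receives. Everything in it is an assumption for illustration: the class names, the `build_sql_prompt` function, the prompt wording and the `sales_db.orders` example are hypothetical, and no real LLM is called.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class Table:
    database: str
    name: str
    columns: list[Column] = field(default_factory=list)


@dataclass
class SemanticModel:
    """Hypothetical semantic model: tables plus business terms mapped to SQL."""
    tables: list[Table]
    metrics: dict[str, str]  # e.g. "revenue" -> "SUM(amount)"


def build_sql_prompt(question: str, model: SemanticModel) -> str:
    """Assemble the prompt text for a text-to-SQL model (format is an assumption)."""
    lines = ["Write one SQL SELECT statement that answers the question.", "", "Tables:"]
    for table in model.tables:
        cols = ", ".join(f"{c.name} ({c.description})" for c in table.columns)
        lines.append(f"- {table.database}.{table.name}: {cols}")
    lines += ["", "Metrics:"]
    for term, expression in model.metrics.items():
        lines.append(f"- {term} = {expression}")
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


# Illustrative data only.
model = SemanticModel(
    tables=[
        Table(
            "sales_db",
            "orders",
            [Column("region", "sales region"), Column("amount", "order value in USD")],
        )
    ],
    metrics={"revenue": "SUM(amount)"},
)
prompt = build_sql_prompt("What is revenue by region?", model)
print(prompt)
```

The point of the semantic model here is that the business term "revenue" reaches the SQL builder already mapped to `SUM(amount)`, so the model does not have to guess it from column names.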
But there is more to it than just a SQL builder. First of all, you need an AI agent called a "planner" that understands the goal and then creates a plan to achieve it.
Secondly, you need a SQL verifier that runs the SQL generated by the SQL builder and checks whether the output is what the plan expects. If not, it notifies the SQL builder to generate another SQL statement, forming a feedback loop.
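The builder and verifier loop can be sketched roughly as below, using Python's built-in `sqlite3` module as a stand-in database. This is a sketch under stated assumptions, not how any particular product does it: `generate_sql` is a hypothetical stub in place of an LLM call (it deliberately returns a broken query first), and "as expected by the plan" is reduced to a simple check on the column names and on the result not being empty.

```python
import sqlite3
from typing import Optional


def generate_sql(question: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for the LLM-backed SQL builder."""
    if feedback is None:
        # First attempt: deliberately wrong column name (amt instead of amount).
        return "SELECT region, SUM(amt) AS revenue FROM orders GROUP BY region"
    # Later attempts: pretend the builder used the feedback to fix the query.
    return (
        "SELECT region, SUM(amount) AS revenue "
        "FROM orders GROUP BY region ORDER BY region"
    )


def verify_sql(conn, sql, expected_columns):
    """Run the SQL. Return (rows, None) on success or (None, feedback) on failure."""
    try:
        cursor = conn.execute(sql)
        rows = cursor.fetchall()
    except sqlite3.Error as exc:
        return None, f"SQL failed: {exc}"
    columns = [d[0] for d in cursor.description]
    if columns != expected_columns:
        return None, f"Expected columns {expected_columns}, got {columns}"
    if not rows:
        return None, "Query returned no rows"
    return rows, None


def answer(conn, question, expected_columns, max_attempts=3):
    """Feedback loop: build, verify, and retry with the verifier's message."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, feedback)
        rows, feedback = verify_sql(conn, sql, expected_columns)
        if feedback is None:
            return rows, attempt
    raise RuntimeError(f"No valid SQL after {max_attempts} attempts: {feedback}")


# Illustrative in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 10.0), ("EU", 5.0), ("US", 7.0)],
)

rows, attempts = answer(conn, "What is revenue by region?", ["region", "revenue"])
print(rows, attempts)
```

The `max_attempts` cap is a design choice worth keeping in any real version, so that a builder that never produces valid SQL cannot loop forever.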
Thirdly, you need a pair of agents for the visualisation: a "chart generator" and a "chart interpreter". The former generates charts from the data, and the latter interprets each chart and provides a narrative that tells its story.
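Here is a small sketch of that pair, assuming the chart generator emits a declarative chart specification rather than an image, and the chart interpreter reads that specification. Both are assumptions of mine: `ChartSpec`, `generate_chart` and `interpret_chart` are hypothetical names, and the interpreter is a simple rule-based stand-in for what would in practice be an LLM-written narrative.

```python
from dataclasses import dataclass


@dataclass
class ChartSpec:
    """Hypothetical declarative description of a chart."""
    kind: str
    title: str
    x_label: str
    y_label: str
    points: list[tuple[str, float]]


def generate_chart(rows, x_label: str, y_label: str) -> ChartSpec:
    """Chart generator: turn (category, value) query rows into a bar-chart spec."""
    points = [(str(x), float(y)) for x, y in rows]
    return ChartSpec("bar", f"{y_label} by {x_label}", x_label, y_label, points)


def interpret_chart(chart: ChartSpec) -> str:
    """Chart interpreter: rule-based stand-in for an LLM narrative.

    Assumes the chart has at least one point.
    """
    top = max(chart.points, key=lambda p: p[1])
    low = min(chart.points, key=lambda p: p[1])
    total = sum(value for _, value in chart.points)
    share = 100 * top[1] / total if total else 0.0
    return (
        f"{chart.title}: {top[0]} is highest at {top[1]:g} "
        f"({share:.0f}% of total), {low[0]} is lowest at {low[1]:g}."
    )


# Illustrative rows, shaped like the output of a verified SQL query.
chart = generate_chart([("EU", 15.0), ("US", 7.0)], "region", "revenue")
narrative = interpret_chart(chart)
print(narrative)
```

Keeping the chart as a specification means the same object can be handed to a rendering library and to the interpreter, so the narrative describes exactly what was drawn.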
So recapping, the agents described above that you need to build are:
- Planner
- SQL builder
- SQL verifier
- Chart generator
- Chart interpreter
And the flow diagram between those agents is something like this:
But don’t quote me on the above, as I am no data chat expert. Josh Reini is.
If you are interested in “building data agents” for data chat like above, yesterday Andrew Ng posted this course: https://www.deeplearning.ai/short-courses/building-and-evaluating-data-agents/ In this short course (only 2 hours, and it’s free at the moment) Josh Reini and Anupam Datta explain how to build data agents:
It’s much more than what I describe above. You’ll also learn about:
For those who follow me on LinkedIn, apologies if I bore you by repeating data chat, data chat, data chat like a broken record. But “chat with your data” will change the whole data warehousing and BI industry, and as someone who has been working and writing in that industry since 1996, I feel obliged to inform the DWBI industry.
Here is my article on Snowflake Intelligence: https://www.garudax.id/pulse/snowflake-intelligence-vincent-rainardi-0iqke/ You can see in this article how the agents I mentioned above think and work. It is truly amazing to see it on Snowflake. It only takes 15 minutes to set up @Nick Akincilar’s demo that I mentioned in that article. Do that demo; it is the most amazing thing I’ve ever seen in data warehousing and BI. Truly. You won’t regret it.
My LinkedIn articles: https://www.garudax.id/pulse/list-all-my-articles-vincent-rainardi-eohge/ My blog: https://dwbi1.wordpress.com/
#DataChat #DataAgent #AI #AgenticAI #DataWarehouse #BusinessIntelligence Snowflake DeepLearning.AI
And this is possibly the most influential post I read from you this year!!! Thank you for sharing, Vincent, this was totally golden!
Interesting read. It seems that in many future use cases, the value proposition won't solely be performance, cost effectiveness and ability to scale, but more and more how easy it is to use the data and find answers with the advent of AI capabilities. But this also puts more focus on the availability of metadata, which is generally ignored.
Thanks for sharing Vincent Rainardi. What is the difference between "Data Agents" and "MCP Servers" that provide similar capabilities? How is it different from what was previously "AI Assistants"? I have come across the term "Data Agent" on Snowflake and MS Fabric, but I get the impression there is an inadvertent or deliberate narrowing of the scope in their offerings...
Setting aside the definitional question and focusing on capability, to badge a component a data agent, I would expect it to provide the following capabilities:
- automating common data tasks, e.g. data preparation tasks such as cleaning messy data, removing duplicates, and handling missing values.
- data exploration & analysis, i.e. autonomously exploring datasets to identify trends, patterns, anomalies, and correlations.
- multi-source connectivity to retrieve data from a variety of data sources including databases, spreadsheets, files, APIs, etc.
- insight generation, which extends beyond providing raw data to generating summaries, creating visualisations, and adding commentary.
We can even extend the above to Agentic Data Management, which opens up a whole set of possibilities... What's your take?
💡 A data agent can become a powerful new interface for data engineers to interact with the data platform. But it will quickly hit its limits — every organization has unique data, systems, and stakeholders. What we need is an adaptive framework that allows us to work with AI effectively. I believe the future of data lies in finding the right balance between: • Business domain knowledge • Data expertise • AI capabilities To capture this intersection, I call the emerging role: AI Insights Orchestrator. Excited to share more on this idea in the future!
All AI agents rely on data to perform their tasks, but data agents are a specific type of agent that specializes in the data domain, primarily focusing on data retrieval, querying, and management tasks, such as extracting information from vast datasets using tools like SQL scripts.