Cortex Search
Cortex is the AI capability in Snowflake. Of all the Cortex features, Cortex Search is probably the least well known. The others get more attention: Cortex Analyst, Cortex AI Functions, Cortex Agent, Cortex MCP Server, Cortex REST API, Cortex AI Documents, Snowflake Intelligence, and the most famous of them all, Cortex Code, which has been all over the news recently.
Why have so few people used Cortex Search in anger? Because it’s for searching text. And you don’t work with text data. You work with numbers. Perhaps because you work in a financial institution. Or in retail, analysing sales data. Or in telecom, working with billing data. Traditionally BI/reporting/dashboarding/analytics are about numbers. Not text.
But hey, it is the era of LLMs now. Everywhere you look, you see LLMs. Chatbots. Your banks and insurers use chatbots. When you book your travel you use chatbots. Your doctor and power company use chatbots. And in the office, yep, you’re using LLMs too, such as in Teams and Copilot.
Cortex Search is the perfect RAG engine for LLM chatbots.
This is super, mega important so let me repeat that:
Cortex Search is the perfect RAG engine for LLM chatbots.
You can use Cortex Search as a RAG engine for chat applications. Searching your text data. Using semantic search.
RAG means your search is grounded. It means that your search is contextualised.
AI 101
Right. Let’s do a quick AI 101 here.
What is RAG (Retrieval Augmented Generation)?
RAG is a technique to retrieve data from a knowledge base. Why? To enhance the general response of an LLM.
In the above diagram, on the right hand side you’re calling an LLM. Like Claude, Llama, Gemini and the famous OpenAI GPT. In Snowflake you call an LLM using the complete function, which is called AI_COMPLETE: link.
It’s really simple to use this function. You just need to specify your prompt and the model you want to use. Optionally you can specify the model parameters too.
Like this:
SELECT AI_COMPLETE
( model => 'claude-sonnet-4-6',
prompt => 'How does LLM work?',
model_parameters => {'temperature': 0.7, 'max_tokens': 10}
);
Simple isn’t it?
Right, that is AI 101. What is RAG, and how to call an LLM.
Ah, sorry I have not explained what RAG is. RAG is about grounding the response of the LLM, by providing the LLM with the context documents.
In the middle part of the above diagram you can see “context documents”. And we produce those context documents using Cortex Search.
In the middle part of the above diagram can you see an arrow with a + sign? It says “+ prompt”. It means that the prompt is enriched with the context documents.
So the LLM (the AI_COMPLETE function) doesn’t use the prompt as its sole input. The LLM uses the prompt + the context documents. That is what “grounded” means.
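To make “prompt + context documents” concrete, here is a minimal Python sketch of the enrichment step. The function name build_grounded_prompt, the prompt wording and the sample documents are all made up for illustration. In a real app the context documents would come from Cortex Search and the final prompt would be passed to AI_COMPLETE.

```python
def build_grounded_prompt(question, context_docs):
    """Combine the user's question with retrieved context documents.

    Hypothetical helper for illustration only: it is not part of any
    Snowflake library.
    """
    # Number each document so the LLM can tell them apart.
    numbered = [f"[{i}] {doc}" for i, doc in enumerate(context_docs, start=1)]
    context_block = "\n".join(numbered)
    return (
        "Answer the question using only the context documents below.\n\n"
        f"Context documents:\n{context_block}\n\n"
        f"Question: {question}"
    )


# Made-up documents standing in for what a search step would return.
context_docs = [
    "Refunds are processed within 5 working days.",
    "Refund requests must include the order number.",
]
prompt = build_grounded_prompt("How long does a refund take?", context_docs)
print(prompt)
```

The string that comes out is what the LLM actually receives: the original question plus the retrieved text, so the answer can be based on your data rather than only on what the model learned in training.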
Right, that is AI 101. What is RAG, and how to call an LLM. And what grounded means. What contextualised means.
Vectors and Keywords
Now let’s see the left part of the above diagram. Can you see the Cortex Search box with Vectors and Keywords in it? It means that Cortex Search can do search using vectors. And it can also do search using keywords.
What does vector mean? In LLM, what is a vector?
In LLM, a vector is a group of numbers that represents a word. For ease of understanding, that group of numbers can be represented with an arrow, like this:
Image source: Carnegie Mellon University (link)
In the above diagram, the word “boy” has coordinates (1,2). The word “man” has coordinates (1,7). And the arrow pointing to the word “woman” ends at coordinates (9,7).
Those arrows are what vectors are. They are a group of numbers that represents a word.
Searching using vectors means searching using the similarity between the words, like this:
Image source: Carnegie Mellon University (link)
Above we can see 4 red lines showing the similarity between king and prince, man and boy, etc. That similarity is what we call semantics, or meaning.
Searching using vector / semantic / meaning means that you search using the “arrow”. Have a look at the 3 words below: grandfather, man and boy:
The arrows for those 3 words have similar directions. That is what searching using vector means.
Have a look at the image below. The blue dots at the top right are science and technology. They are located near each other. At the bottom right of the page, the green dots for Sports are located near each other:
So in this “semantic space”, words with similar meaning are located near each other.
That is what searching using vector means.
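If you want to see “similar direction” as a number, here is a small Python sketch. It uses the toy coordinates for boy, man and woman from the example earlier, and it uses cosine similarity as the measure. That choice is my assumption for illustration: it is a common way to compare vectors, but this article does not state which measure Cortex Search uses internally. The function names are made up too.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose arrow points in the most similar direction."""
    others = [w for w in vectors if w != word]
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


# Toy 2-D coordinates from the example earlier; real embeddings have
# hundreds of dimensions.
vectors = {"boy": (1, 2), "man": (1, 7), "woman": (9, 7)}

for other in ("man", "woman"):
    print(other, round(cosine_similarity(vectors["boy"], vectors[other]), 3))
print(most_similar("boy", vectors))
```

With these toy numbers, “man” comes out closer to “boy” than “woman” does. A vector search does the same thing at scale: it turns your query into a vector and returns the rows whose vectors point in the most similar direction.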
Search using keywords is the traditional search we all know. For example, we search for the exact word “cab”. Or words containing “cab”. Or words beginning with “cab” or ending with “cab”, etc. Like using Regex or using LIKE in SQL.
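Here is a minimal Python sketch of those four kinds of keyword match, using regular expressions. The word list and the find helper are made up for illustration, and the comments show the rough SQL equivalent.

```python
import re

words = ["cab", "cabin", "taxicab", "vocabulary", "taxi"]
keyword = "cab"


def find(pattern, items):
    """Return the items that fully match the given regular expression."""
    compiled = re.compile(pattern)
    return [item for item in items if compiled.fullmatch(item)]


escaped = re.escape(keyword)
exact = find(escaped, words)               # roughly: = 'cab'
contains = find(f".*{escaped}.*", words)   # roughly: LIKE '%cab%'
starts_with = find(f"{escaped}.*", words)  # roughly: LIKE 'cab%'
ends_with = find(f".*{escaped}", words)    # roughly: LIKE '%cab'

print(exact, contains, starts_with, ends_with)
```

Notice that “taxi” never matches, even though it means the same thing as “cab”. Keyword search only sees the letters. That is the gap that vector search fills.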
Ok. Now we know what “Cortex Search can do search using vectors and using keywords” means.
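So how can one search use both at once? This article doesn’t cover how Cortex Search merges the two internally, so treat the following purely as an illustration. One well-known technique for merging two ranked lists is reciprocal rank fusion (RRF). The sketch below applies it to a made-up vector ranking and a made-up keyword ranking. The document ids and the constant k=60 are assumptions I picked for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranked list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked high by several searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: best match first in each list.
vector_ranking = ["doc_b", "doc_a", "doc_c"]
keyword_ranking = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Here doc_a ends up first because it is near the top of both lists, even though the vector search alone preferred doc_b.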
Now that we have understood all the components, let’s have another look at this diagram:
And let me repeat the super important thing at the beginning of this article:
Cortex Search is the perfect RAG engine for LLM chatbots.
Cortex Search practically
That’s the concept. But as you know, I don’t like “fluffy cloud” words. When I hear fluffy words like “Cortex Search is the perfect RAG engine for LLM chatbots” I always ask: “What does it mean practically?”
Imagine that you have a table like this: (see here for the source)
And you create a Cortex Search service like this:
This is what the yellow numbers mean:
It means that you can create a UI like below:
In the above UI you can specify:
The End
So if you have databases or documents containing text, you can do a “grounded” search using Cortex Search. You can search using vectors (semantic) or keywords. Cortex Search is the perfect RAG engine for LLM chatbots.
Keep learning! My LinkedIn articles: link.
Snowflake Docs on Cortex Code: link.
Cortex Code demo: link.