Cortex Search

Cortex is the AI capability in Snowflake. Of all the Cortex features, Cortex Search is probably the least well known. The others are better known: Cortex Analyst, Cortex AI Functions, Cortex Agent, Cortex MCP Server, Cortex REST API, Cortex AI Documents, Snowflake Intelligence, and the most famous of them all: Cortex Code, which has been all over the news recently.

Why have so few people used Cortex Search in anger? Because it’s for searching text. And you don’t work with text data. You work with numbers. Perhaps because you work in a financial institution. Or in retail, analysing sales data. Or in telecom, working with billing data. Traditionally, BI, reporting, dashboarding and analytics are about numbers. Not text.

But hey, this is the era of LLMs. Everywhere you look, you see LLMs. Chatbots. Your bank and your insurer use chatbots. When you book your travel, you use a chatbot. Your doctor and your power company use chatbots. And in the office, yep, you’re using LLMs too, such as in Teams and Copilot.

Cortex Search is the perfect RAG engine for LLM chatbots.

This is super, mega important so let me repeat that:

Cortex Search is the perfect RAG engine for LLM chatbots.

You can use Cortex Search as a RAG engine for chat applications. Searching your text data. Using semantic search.

RAG means the LLM’s answer is grounded. It means that the answer is contextualised: it is based on text retrieved from your own data.

AI 101

Right. Let’s do a quick AI 101 here.

What is RAG (Retrieval Augmented Generation)?

RAG is a technique that retrieves relevant data from a knowledge base and hands it to an LLM. Why? To improve on the generic response the LLM would give on its own.

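[Diagram: RAG flow with Cortex Search (vectors and keywords) on the left, context documents and “+ prompt” in the middle, and the LLM on the right]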

In the above diagram, on the right-hand side you’re calling an LLM, like Claude, Llama, Gemini or the famous OpenAI GPT. In Snowflake you call an LLM using the Complete function. It’s called AI_COMPLETE: link.

It’s really simple to use this function. You just need to specify the model you want to use and your prompt. Optionally, you can specify the model parameters too.

Like this:

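SELECT AI_COMPLETE(
    model => 'claude-sonnet-4-6',       -- the LLM to call
    prompt => 'How does an LLM work?',  -- your question
    -- optional; max_tokens of 10 keeps the answer to about 10 tokens
    model_parameters => {'temperature': 0.7, 'max_tokens': 10}
);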

Simple isn’t it?

Right, that is AI 101. What is RAG, and how to call an LLM.

[Diagram: RAG flow with Cortex Search, context documents, “+ prompt” and the LLM]

Ah, sorry, I have not fully explained what RAG is. RAG is about grounding the response of the LLM by providing the LLM with context documents.

In the middle part of the above diagram you can see “context documents”. And we produce those context documents using Cortex Search.

In the middle part of the above diagram, can you see an arrow with a + sign? It says “+ prompt”. It means that the prompt is enriched with the context documents.

So the LLM (the AI_COMPLETE function) doesn’t use the prompt as its sole input. The LLM uses the prompt + the context documents. That is what “grounded” means.
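
To make “prompt + context documents” concrete, here is a minimal sketch of a grounded call. The context text and the question are made up for illustration; in a real RAG flow the context would come from a search step such as Cortex Search. The model name is simply the one used earlier in this article.

```sql
-- Sketch only: the "context document" is hard-coded and hypothetical.
-- The prompt sent to the LLM = instructions + context + the user's question.
SELECT AI_COMPLETE(
    model  => 'claude-sonnet-4-6',
    prompt => 'Answer the question using only the context below.\n\n'
              || 'Context: Private room with twin bed, mini fridge and Netflix.\n\n'
              || 'Question: Does the room have a fridge?'
) AS grounded_answer;
```

Without the Context line the LLM could only guess. With it, the answer is tied to your own data.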

Right, that is AI 101. What is RAG, and how to call an LLM. And what grounded means. What contextualised means.

Vectors and Keywords

[Diagram: RAG flow with Cortex Search, context documents, “+ prompt” and the LLM]

Now let’s see the left part of the above diagram. Can you see the Cortex Search box with Vectors and Keywords in it? It means that Cortex Search can do search using vectors. And it can also do search using keywords.

What does vector mean? In an LLM, what is a vector?

In an LLM, a vector is a group of numbers that represents a word. For ease of understanding, that group of numbers can be drawn as an arrow, like this:

Image source: Carnegie Mellon University (link)

In the above diagram, the word “boy” has coordinates (1,2). The word “man” has coordinates (1,7). And the word “woman” has coordinates (9,7).

Those arrows are what vectors are. Each one is a group of numbers that represents a word.

Searching using vectors means searching using the similarity between the words, like this:

Image source: Carnegie Mellon University (link)

Above we can see 4 red lines showing the similarity between king and prince, man and boy, etc. That kind of similarity is called semantic: it is about meaning.

Searching using vector / semantic / meaning means that you search using the “arrow”. Have a look at the 3 words below: grandfather, man and boy:

Image source: Carnegie Mellon University

The arrows for those 3 words have similar directions. That is what searching using vector means.
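
One common way to put a number on “similar direction” is cosine similarity. Below is a small sketch, assuming Snowflake’s VECTOR data type and its VECTOR_COSINE_SIMILARITY function, and using the 2-number toy coordinates quoted above for boy, man and woman. Real embeddings have hundreds of numbers, and this is not necessarily the exact measure Cortex Search uses internally.

```sql
-- Toy example: 2-dimensional "word vectors" taken from the diagram's coordinates.
-- A cosine similarity of 1.0 means the two arrows point in exactly the same direction.
SELECT
    VECTOR_COSINE_SIMILARITY(
        [1, 2]::VECTOR(FLOAT, 2),   -- boy
        [1, 7]::VECTOR(FLOAT, 2)    -- man
    ) AS boy_vs_man,
    VECTOR_COSINE_SIMILARITY(
        [1, 2]::VECTOR(FLOAT, 2),   -- boy
        [9, 7]::VECTOR(FLOAT, 2)    -- woman
    ) AS boy_vs_woman;
```

With these toy coordinates, boy vs man works out to roughly 0.95 and boy vs woman to roughly 0.90, so the arrow for “boy” points closer to “man” than to “woman”.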

Have a look at the image below. The blue dots at the top right are Science and Technology. They are located near each other. At the bottom right of the page, the green dots for Sports are located near each other:

Image source: Primer.ai (link)

So in this “semantic space”, words with similar meaning are located near each other.

That is what searching using vector means.

Searching using keywords is the traditional search we all know. For example, we search for the exact word “cab”. Or words containing “cab”. Or words beginning with “cab” or ending with “cab”, etc. Like using Regex or using LIKE in SQL.
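
The keyword patterns just described can be sketched in plain SQL. The word list is made up for illustration. It also shows the limits of keywords: “vocabulary” contains “cab” but has nothing to do with cabs, while “taxi” means the same thing as “cab” but never matches.

```sql
-- Hypothetical word list, built inline so the query is self-contained.
WITH words AS (
    SELECT column1 AS word
    FROM VALUES ('cab'), ('cabin'), ('taxicab'), ('vocabulary'), ('taxi')
)
SELECT
    word,
    word = 'cab'                 AS exact_match,
    word LIKE '%cab%'            AS contains_cab,
    word LIKE 'cab%'             AS begins_with_cab,
    word LIKE '%cab'             AS ends_with_cab,
    -- REGEXP_LIKE must match the whole string, hence the leading and trailing .*
    REGEXP_LIKE(word, '.*cab.*') AS regex_contains_cab
FROM words;
```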

Ok. Now we know what “Cortex Search can do search using vectors and using keywords” means.

Now that we have understood all the components, let’s have another look at this diagram:

[Diagram: RAG flow with Cortex Search, context documents, “+ prompt” and the LLM]

And let me repeat the super important thing at the beginning of this article:

Cortex Search is the perfect RAG engine for LLM chatbots.

Cortex Search practically

That’s the concept. But as you know, I don’t like “fluffy cloud” words. When I hear fluffy words like “Cortex Search is the perfect RAG engine for LLM chatbots” I always ask: “What does it mean practically?”

Imagine that you have a table like this: (see here for the source)

[Image: sample rows from the airbnb_listings table]

And you create a Cortex Search service like this:

[Image: the statement that creates the Cortex Search service, annotated with yellow numbers 1 to 4]

This is what the yellow numbers mean:

  1. You can search the column specified in the ON parameter: listing_text, which combines these 3 columns: summary, description, space.
  2. You can filter the search result using the columns specified in the ATTRIBUTES parameter i.e. room_type and amenities.
  3. This service will be no more than 1 hour behind the source table.
  4. The source table is airbnb_listings.
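
The four points above can be sketched as a CREATE CORTEX SEARCH SERVICE statement. This is a reconstruction from the description, not a copy of the original screenshot: the service name (airbnb_svc), the warehouse name (my_wh), the COALESCE calls and the way the three columns are joined together are all assumptions.

```sql
CREATE OR REPLACE CORTEX SEARCH SERVICE airbnb_svc   -- hypothetical service name
    ON listing_text                                   -- (1) the column you search
    ATTRIBUTES room_type, amenities                   -- (2) columns you can filter on
    WAREHOUSE = my_wh                                 -- hypothetical warehouse name
    TARGET_LAG = '1 hour'                             -- (3) at most 1 hour behind the source
    AS
        SELECT
            room_type,
            amenities,
            -- listing_text combines the 3 text columns; COALESCE stops one
            -- NULL column from turning the whole text into NULL
            COALESCE(summary, '') || '\n\n' ||
            COALESCE(description, '') || '\n\n' ||
            COALESCE(space, '') AS listing_text
        FROM airbnb_listings;                         -- (4) the source table
```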

It means that you can create a UI like below:

[Image: search UI with a query box, filters, a result limit and the search results]

In the above UI you can see:

  1. The query/prompt: “furnished room with Netflix, twin bed and mini fridge”
  2. The filters: Room Type and Amenities
  3. The limit: the result is limited to 5
  4. The search results. The red boxes highlight the matches to the query/prompt, i.e. furnished room, Netflix, twin bed and mini fridge.
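
Behind a UI like that, the same search can be sketched in SQL with the SNOWFLAKE.CORTEX.SEARCH_PREVIEW function. The service name airbnb_svc and the filter value “Private room” are assumptions for illustration; use whatever your service is called and whatever values your room_type column holds.

```sql
-- Sketch: query, filter and limit map to points 1 to 3 of the UI.
SELECT PARSE_JSON(
    SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
        'airbnb_svc',                                  -- hypothetical service name
        '{
            "query": "furnished room with Netflix, twin bed and mini fridge",
            "columns": ["listing_text", "room_type", "amenities"],
            "filter": {"@eq": {"room_type": "Private room"}},
            "limit": 5
        }'
    )
)['results'] AS results;                               -- point 4: the search results
```

The filter can only use columns listed in the ATTRIBUTES parameter of the service.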

The End

So if you have databases or documents containing text, you can do a “grounded” search using Cortex Search. You can search using vectors (semantic) or keywords. Cortex Search is the perfect RAG engine for LLM chatbots.
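
Putting the pieces together, here is a rough end-to-end sketch: Cortex Search produces the context documents, and AI_COMPLETE answers using the prompt + those documents. The service name airbnb_svc, the prompt wording and the question are assumptions, and a production chatbot would normally do this from application code rather than one SQL statement.

```sql
-- Step 1: retrieve the context documents with Cortex Search.
WITH hits AS (
    SELECT PARSE_JSON(
        SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
            'airbnb_svc',                              -- hypothetical service name
            '{
                "query": "furnished room with Netflix, twin bed and mini fridge",
                "columns": ["listing_text"],
                "limit": 5
            }'
        )
    )['results'] AS results
)
-- Step 2: enrich the prompt with those documents and call the LLM.
SELECT AI_COMPLETE(
    model  => 'claude-sonnet-4-6',
    prompt => 'Answer using only the listings in the context.\n\n'
              || 'Context: ' || TO_JSON(results) || '\n\n'
              || 'Question: Which of these listings suits a student best, and why?'
) AS grounded_answer
FROM hits;
```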

Keep learning! My LinkedIn articles: link.

Snowflake Docs on Cortex Search: link.

Cortex Search demo: link.
