Stop Scanning, Start Navigating: Why Graph-Based Indexing is Important for Information Retrieval
Source: Qdrant (https://qdrant.tech/course/essentials/day-2/what-is-hnsw/)


Vector search is a navigation problem, not a calculation problem.

What is HNSW (Hierarchical Navigable Small World)?

It is a graph-based indexing algorithm that builds a multi-layer graph to balance search speed and quality even at large scale. It combines two ideas:

  1. Probabilistic skip lists (for hierarchy): creating hierarchical layers (top, middle, bottom) into which each node is probabilistically placed.

[Image: Skip list; source: Supabase]

  2. Navigable small worlds (for graph connectivity): creating bidirectional links between nodes to enable neighbourhood-based search.

[Image: Navigable small world; source: Supabase]


Note: The search within the layers is purely graph-based, ensuring that any point can reach any other point within very few hops (the small-world concept in graph theory).
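The skip-list idea can be sketched as a layer draw: in the HNSW paper, a new node's top layer is sampled as floor(-ln(U) * mL), so each higher layer holds exponentially fewer nodes. A minimal sketch (M = 16 is an assumed default here, not a value from this article):

```python
import math
import random

def assign_layer(m_l: float = 1 / math.log(16)) -> int:
    """Skip-list-style level draw from the HNSW paper:
    layer = floor(-ln(U) * mL), with U ~ uniform(0, 1).
    Each extra layer is exponentially less likely."""
    return math.floor(-math.log(random.random()) * m_l)

random.seed(42)
layers = [assign_layer() for _ in range(100_000)]
# Most nodes live only on layer 0; a thin minority reach the upper layers.
histogram = {lvl: layers.count(lvl) for lvl in sorted(set(layers))}
```

This is what makes the top layers sparse "satellite views" and the bottom layer dense: the layer population falls off geometrically.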

The current bottleneck we face in Information Retrieval:

Let's take an analogy to understand the problem

Imagine trying to find the best coffee shop in a massive city.

The most naive way would be to walk every single street, check every cafe, and then decide which one is best (flat search / brute-force k-NN). It would give you the perfect answer, but it would take forever and is impractical at scale.

Note: Keep in mind there is always a tradeoff between performance (with respect to latency) and accuracy.
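The walk-every-street approach corresponds to brute-force (flat) k-NN: compute the distance to every stored vector and keep the k closest. A minimal NumPy sketch (the "cafe" embeddings are toy random data):

```python
import numpy as np

def flat_search(query: np.ndarray, vectors: np.ndarray, k: int = 3) -> np.ndarray:
    """Exact k-NN: one distance per stored vector, O(N) work per query,
    perfect recall -- the 'check every cafe' strategy."""
    dists = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
cafes = rng.normal(size=(10_000, 64))  # every cafe in the city, as a vector
query = rng.normal(size=64)
best = flat_search(query, cafes)       # indices of the k closest cafes
```

The O(N) distance pass is exactly what graph-based indexes are built to avoid.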

How does HNSW help in this scenario?

You don’t start by checking every street. Instead, you first look at a city-level map to get into the right neighbourhood. Then you switch to local roads, and finally walk block by block to find the exact spot.


How does HNSW work?

[Image: HNSW multi-layer structure; source: https://arxiv.org/pdf/1603.09320]

  1. Multi-Layer Graphs (The Zoom Effect/City Level Map):

In HNSW, data isn't just one flat list; it’s a multi-storey skyscraper of graphs.

  • The Top Layers (The "Satellite View"): This is your high-level map. It only contains a few major landmarks. You use these to make massive "hops" across the data landscape.
  • The Middle Layers (The "Neighbourhood View"): Once you're in the right city, you drop down a layer to find the right district using local roads.
  • The Bottom Layer (The "Sidewalk View"): This contains every single data point. Here, you perform the final, precise search to find the exact cafe.

2. Greedy Search (The "Next Cafe" Logic):

HNSW doesn't explore the whole city at once; it acts like a coffee lover who only cares about the next best cafe spot.

  • The Step: At every node (intersection), it looks at its neighbours and asks: "Which of these brings me closer to my destination?"
  • The Hop: It immediately moves to the closest neighbour.
  • The Result: Because the graph is built for navigation, you are always moving "warmer" toward the target. You don't waste time on dead ends or in cafes that don't serve what you're looking for.
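The layered descent and the greedy hop logic above can be sketched together: greedy navigation within one layer, with each layer's local optimum seeding the entry point of the next, finer layer. A toy 1-D sketch (the line-graph layers are invented for illustration, not how a real HNSW index wires its edges):

```python
import numpy as np

def greedy_layer_search(graph, vectors, entry, query):
    """Greedy navigation within one layer: from `entry`, keep hopping to
    whichever neighbour is closer to `query`; stop at a local minimum.
    `graph` maps node id -> list of neighbour ids on this layer."""
    current = entry
    best = np.linalg.norm(vectors[current] - query)
    while True:
        improved = False
        for nb in graph.get(current, []):
            d = np.linalg.norm(vectors[nb] - query)
            if d < best:
                current, best, improved = nb, d, True
        if not improved:
            return current  # no neighbour is closer: local optimum

def hnsw_search(layers, vectors, entry, query):
    """Descend the hierarchy: each layer's local optimum becomes the
    entry point of the next, finer layer (top layer first)."""
    for graph in layers:
        entry = greedy_layer_search(graph, vectors, entry, query)
    return entry

# Toy 1-D "city": the bottom layer links each point to its immediate
# neighbours; the top layer links only every 10th point ("highways").
vectors = np.arange(100, dtype=float).reshape(-1, 1)
bottom = {i: [j for j in (i - 1, i + 1) if 0 <= j < 100] for i in range(100)}
top = {i: [j for j in (i - 10, i + 10) if 0 <= j < 100] for i in range(0, 100, 10)}

found = hnsw_search([top, bottom], vectors, entry=0, query=np.array([73.2]))
```

The top layer covers most of the distance in coarse 10-unit hops, and the bottom layer finishes with short precise steps, which is the "satellite view to sidewalk view" descent in miniature.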

3. Six Degrees of Separation (The "Highway" Secret)

Why is this faster than checking a list? Because of Shortcuts.

  • Small World Connectivity: In graph theory, a "Small World" is a network where most nodes aren't direct neighbours, but any two nodes are connected by a very short chain of edges/relationships.
  • The Power of Hops: Just as a highway connects two distant neighbourhoods directly, HNSW creates "long-range" edges between distant nodes, letting the search leap across the dataset instead of crawling street by street.
  • The Efficiency: Even if you have a billion data points, you can usually find your target in fewer than 50–100 hops. You aren't searching through a billion items; you're navigating through a few dozen connections.
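A rough way to see why the hop count stays tiny: if each hop roughly halves the remaining search space (an idealised assumption, not a guarantee of the real algorithm), the number of hops grows only logarithmically with the dataset size:

```python
import math

# Idealised hop count if each hop halves the remaining candidates.
for n in (10**6, 10**9, 10**12):
    print(f"{n:>16,d} points -> ~{math.ceil(math.log2(n))} hops")
```

Even at a trillion points the idealised count stays around 40, which is why graph navigation beats scanning.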


Constructive Outlook:

Can graph databases utilise the HNSW concept (already available in most vector databases)?

HNSW is an ANN search method: it stores vector indexes efficiently and answers "how do we find the nearest vectors?" with a balanced approach between latency and accuracy. It does not represent semantics.

Graph databases, on the other hand, capture the semantic meaning of the relationships between nodes, i.e. how are X and Y connected?

Recently, graph databases have begun providing vector indexes as an add-on. While these indexes return the top-K nodes for a graph engine to traverse, this approach introduces significant overhead between the two layers. This becomes a critical bottleneck when scaling to billions of nodes and relationships where low latency is crucial.
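A sketch of that two-stage handoff (all class and method names here are illustrative, not a real graph-database API): the vector side returns top-K seed node ids, then a separate graph layer expands them, and every query pays the cost of crossing that boundary.

```python
import numpy as np

class VectorIndex:
    """Stage 1 (illustrative): a standalone vector index that only
    knows embeddings and returns top-K node ids."""
    def __init__(self, vectors):
        self.vectors = vectors
    def top_k(self, query, k):
        dists = np.linalg.norm(self.vectors - query, axis=1)
        return np.argsort(dists)[:k].tolist()

class GraphEngine:
    """Stage 2 (illustrative): a separate graph layer that expands
    the seed nodes along relationships."""
    def __init__(self, adjacency):
        self.adj = adjacency  # node id -> list of neighbour ids
    def expand(self, seeds, hops=1):
        seen = set(seeds)
        frontier = set(seeds)
        for _ in range(hops):
            frontier = {nb for n in frontier for nb in self.adj.get(n, [])} - seen
            seen |= frontier
        return seen

rng = np.random.default_rng(1)
index = VectorIndex(rng.normal(size=(1_000, 32)))
engine = GraphEngine({i: [(i + 1) % 1_000, (i + 7) % 1_000] for i in range(1_000)})

seeds = index.top_k(rng.normal(size=32), k=5)  # handoff: ids cross engines
context = engine.expand(seeds, hops=2)         # traversal on the other side
```

Every query serialises ids across the index/engine boundary; at billion-node scale that per-query handoff, plus duplicated memory and consistency bookkeeping between the two layers, is where the overhead accumulates.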


Additional Reads:

  1. https://arxiv.org/pdf/1603.09320
  2. https://qdrant.tech/course/essentials/day-2/what-is-hnsw/
  3. https://supabase.com/docs/guides/ai/vector-indexes/hnsw-indexes
  4. https://docs.falkordb.com/cypher/indexing/vector-index.html
  5. https://neo4j.com/blog/developer/neo4j-langchain-vector-index-implementation/
  6. https://neo4j.com/docs/cypher-manual/current/indexes/semantic-indexes/vector-indexes/
  7. https://arxiv.org/pdf/2401.18059

Treating vector search as a first-class graph operator, not a handoff, is the real unlock here. The edge case I’d watch is update pressure. If embeddings and relationships change often, HNSW maintenance and consistency semantics can become the bottleneck. How are you thinking about online index updates and read consistency when vectors live as native properties on nodes and relationships?

The "context-switching latency" framing is interesting but how much are we actually talking about - single digit milliseconds or something worse? At billion scale the bottleneck might be elsewhere entirely.

Check out TuringDB as well; similar approach there, coupled with FAISS.

This reminds me of cartesian tiers but for graph indices.

YAOUTCHI! That's huge! Because I was thinking about the switch between #Neo4j and #FalkorDB for the exact same reason. And FalkorDB is using matrices natively, right?
