We needed a Dictionary/Hashmap for an algorithm. One small change made it 3x faster.

Our team was working on an algorithm that needed fast lookups.

Simple choice: use a Dictionary.

“It’s O(1), right?”

var dict = new Dictionary<string, Order>();
foreach (var order in ordersList)  // 50,000 items
{
    dict.Add(order.Id, order);
}

But here’s the thing: O(1) isn’t the full story.

Let me break down what’s actually happening inside.


The Ideal World

Imagine if a dictionary worked like this:

key → hash → that hash is the memory address → done

Your key hashes to a number. That number is exactly where your data lives in memory.

No searching. No collisions. True O(1).
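This ideal has a name, direct addressing, and it does work when the key space is tiny. A minimal Java sketch (hypothetical class, illustration only):

```java
// Direct addressing: the key IS the array index, so a lookup is a
// single memory access. No hashing tricks, no collisions possible.
// Only feasible when the key space is tiny (here: keys 0..255).
public class DirectAddress {
    static String[] slots = new String[256];

    static void put(int key, String value) { slots[key] = value; }

    static String get(int key) { return slots[key]; }  // true O(1)

    public static void main(String[] args) {
        put(42, "hello");
        System.out.println(get(42)); // hello
    }
}
```

The catch, as the next paragraph shows, is that real key spaces are nowhere near this small.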

But there’s a problem.

Hash functions return 32-bit or 64-bit values, depending on the language. Even at 32 bits, that’s about 4 billion possible values.

If we reserved one memory slot for every possible hash, an empty dictionary would need about 32 GB of RAM (4 billion slots × 8 bytes per reference).

And you might only store 100 items.

So instead, dictionaries do something clever.


The Real World: A Smaller Array

Rather than 4 billion slots, a dictionary allocates a small array, let’s say 16 slots.

This array is called buckets. Think of it like 16 boxes where your data can live.

Now we need to squeeze our huge hash into this small space:

hash % bucketCount = which bucket to use

hash: 1,894,726,153
buckets: 16
index: 1,894,726,153 % 16 = 9

Simple division. Remainder tells us the bucket.

16 slots instead of 4 billion. Memory problem solved. (In practice, Java’s HashMap keeps its bucket count a power of two and uses bit masking instead of modulo: faster, same result. .NET’s Dictionary uses prime-sized bucket counts instead.)
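As a quick sketch of that bucket math (Java here, since the same trick applies to HashMap; the class name is just for illustration):

```java
public class BucketMath {
    public static void main(String[] args) {
        int hash = 1_894_726_153;
        int bucketCount = 16;

        // Plain modulo: the remainder picks the bucket.
        int indexByModulo = hash % bucketCount;

        // When bucketCount is a power of two, (hash & (bucketCount - 1))
        // keeps only the low bits. Same result, cheaper than division.
        int indexByMask = hash & (bucketCount - 1);

        System.out.println(indexByModulo); // 9
        System.out.println(indexByMask);   // 9
    }
}
```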

But we’ve created a new problem.

When you squeeze 4 billion possibilities into 16 buckets, multiple keys will land in the same bucket.

This is called a collision.


Collisions: Two Keys, One Bucket

Two ways to handle this:

Chaining — each bucket holds a linked list. Colliding items chain together.

Open Addressing — bucket taken? Keep probing forward until you find an empty one.

Either way, you’re now walking through items to find the right one.

Few collisions? Still fast.

All keys in one bucket? That’s O(n), a linear search.
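To make chaining concrete, here is a minimal, hypothetical chaining table in Java (illustration only, not how java.util.HashMap is actually implemented). The two keys below really do collide: "Aa" and "BB" share a hash code in Java.

```java
import java.util.LinkedList;

// Chaining sketch: each bucket holds a linked list of entries,
// and colliding keys simply chain together in the same bucket.
public class ChainingSketch {
    record Entry(String key, String value) {}

    static final int BUCKETS = 16;
    @SuppressWarnings("unchecked")
    static LinkedList<Entry>[] table = new LinkedList[BUCKETS];

    static int bucketOf(String key) {
        // floorMod guards against negative hashCode() values.
        return Math.floorMod(key.hashCode(), BUCKETS);
    }

    static void put(String key, String value) {
        int i = bucketOf(key);
        if (table[i] == null) table[i] = new LinkedList<>();
        table[i].removeIf(e -> e.key().equals(key)); // overwrite duplicates
        table[i].add(new Entry(key, value));
    }

    static String get(String key) {
        int i = bucketOf(key);
        if (table[i] == null) return null;
        // The collision cost: walk the chain to find the right key.
        for (Entry e : table[i]) {
            if (e.key().equals(key)) return e.value();
        }
        return null;
    }

    public static void main(String[] args) {
        put("Aa", "first");
        put("BB", "second"); // lands in the SAME bucket as "Aa"
        System.out.println(get("Aa"));
        System.out.println(get("BB"));
    }
}
```

The longer that walk inside get() gets, the further you drift from O(1).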


The Hidden Cost: Resize

This is where performance silently dies.

Dictionaries start small , just a few buckets.

As items grow, they track a load factor (items ÷ buckets).

When it crosses ~70–75% (the exact threshold differs by implementation), the structure needs more room. It resizes:

1. Allocate a new array (roughly 2x bigger, or the next prime, depending on the implementation)
2. Loop through EVERY existing item
3. Recalculate each: hash % newBucketCount
4. Reinsert into new positions

One .Add() just touched every item in the collection.
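The four steps above can be sketched like this (a toy open-addressing table of int keys in Java; names and details are illustrative, not any real implementation):

```java
// Resize sketch: a tiny open-addressing table of int keys,
// where 0 is reserved to mean "empty slot".
public class ResizeSketch {
    static int[] table = new int[4];
    static int count = 0;

    static void insert(int key, int[] t) {
        int i = Math.floorMod(Integer.hashCode(key), t.length);
        while (t[i] != 0) i = (i + 1) % t.length;  // linear probing
        t[i] = key;
    }

    static void add(int key) {
        // Check the load factor BEFORE inserting.
        if ((count + 1) > table.length * 3 / 4) {
            // Step 1: allocate a new array, roughly 2x bigger.
            int[] bigger = new int[table.length * 2];
            // Steps 2-4: loop through EVERY existing item,
            // recompute hash % newBucketCount, reinsert.
            for (int k : table) {
                if (k != 0) insert(k, bigger);
            }
            table = bigger;
        }
        insert(key, table);
        count++;
    }
}
```

That inner loop over every existing item is the hidden cost: a single add() can trigger O(n) work.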

50,000 items without pre-sizing:

Start:     3 buckets
Resize 1:  7 buckets    → rehash all
Resize 2:  17 buckets   → rehash all
Resize 3:  37 buckets   → rehash all
...
Resize 13: 75,000+ buckets → rehash all

Could be 10-15 resizes depending on implementation. Each one rehashes everything. Tens of thousands of wasted operations.

With pre-sizing:

var dict = new Dictionary<string, Order>(capacity: 50000);

Zero resizes. 3x faster.
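In Java the same idea looks like this (a sketch; the exact speedup depends on workload and runtime, and note that HashMap’s constructor takes a bucket capacity, not an item count):

```java
import java.util.HashMap;
import java.util.Map;

public class PreSized {
    public static void main(String[] args) {
        int count = 50_000;

        // Default capacity: resizes roughly a dozen times on the way up.
        Map<Integer, String> resized = new HashMap<>();

        // Pre-sized: divide by the 0.75 load factor so the table
        // never crosses its resize threshold. Zero resizes.
        Map<Integer, String> presized = new HashMap<>((int) (count / 0.75f) + 1);

        for (int i = 0; i < count; i++) {
            resized.put(i, "order-" + i);
            presized.put(i, "order-" + i);
        }
        // Same contents either way; the difference is the hidden rehash work.
        System.out.println(resized.equals(presized)); // true
    }
}
```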


The Real Complexity

Scenario             Time
Average case         O(1)
Many collisions      O(n)
Resize triggered     O(n)
Bad hash function    O(n)


Simple Takeaway

If you know the size, tell the dictionary.

C#:     new Dictionary<K,V>(capacity: count)
Java:   new HashMap<>((int) (count / 0.75f) + 1)   // capacity is buckets, so account for the 0.75 load factor

If you don’t know the cost, you’re already paying it.
