We needed a Dictionary/Hashmap for an algorithm. One small change made it 3x faster.
Our team was working on an algorithm that needed fast lookups.
Simple choice: use a Dictionary.
“It’s O(1), right?”
var dict = new Dictionary<string, Order>();
foreach (var order in ordersList) // 50,000 items
{
dict.Add(order.Id, order);
}
But here’s the thing: O(1) isn’t the full story.
Let me break down what’s actually happening inside.
The Ideal World
Imagine if a dictionary worked like this:
key → hash → that hash is the memory address → done
Your key hashes to a number. That number is exactly where your data lives in memory.
No searching. No collisions. True O(1).
But there’s a problem.
Hash functions return 32-bit or 64-bit values, depending on the language. Even at 32 bits, that’s over 4 billion possible values.
If we reserved one 8-byte memory slot for every possible hash, we’d need 32 GB of RAM just to create an empty dictionary.
And you might only store 100 items.
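A quick sanity check on that 32 GB figure (Java used here and throughout for illustration):

```java
public class SlotMath {
    public static void main(String[] args) {
        long slots = 1L << 32;        // 4,294,967,296 possible 32-bit hashes
        long bytesPerSlot = 8;        // one 8-byte reference per slot
        long gib = slots * bytesPerSlot / (1024L * 1024 * 1024);
        System.out.println(gib + " GB"); // 32 GB
    }
}
```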
So instead, dictionaries do something clever.
The Real World: A Smaller Array
Rather than 4 billion slots, a dictionary allocates a small array, let’s say 16 slots.
This array is called buckets. Think of it like 16 boxes where your data can live.
Now we need to squeeze our huge hash into this small space:
hash % bucketCount = which bucket to use
hash: 1,894,726,153
buckets: 16
index: 1,894,726,153 % 16 = 9
Simple division. Remainder tells us the bucket.
16 slots instead of 4 billion. Memory problem solved. (In practice, Java’s HashMap keeps the bucket count a power of two and uses bit masking instead of modulo, which is faster with the same result; .NET’s Dictionary uses prime-sized bucket counts with an optimized modulo.)
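Both index strategies fit in a couple of lines. A small sketch using the hash from the example above:

```java
public class BucketIndex {
    public static void main(String[] args) {
        int hash = 1_894_726_153;
        int buckets = 16;

        // General approach: remainder of division.
        // floorMod guards against negative hash codes.
        int byModulo = Math.floorMod(hash, buckets);

        // Power-of-two shortcut (what Java's HashMap does):
        // mask off everything but the low bits.
        int byMask = hash & (buckets - 1);

        System.out.println(byModulo); // 9
        System.out.println(byMask);   // 9
    }
}
```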
But we’ve created a new problem.
When you squeeze 4 billion possibilities into 16 buckets, multiple keys will land in the same bucket.
This is called a collision.
Collisions: Two Keys, One Bucket
Two ways to handle this:
Chaining — each bucket holds a linked list. Colliding items chain together.
Open Addressing — bucket taken? Keep probing forward until you find an empty one.
Either way, you’re now walking through items to find the right one.
Few collisions? Still fast.
All keys in one bucket? That’s O(n) linear search.
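Chaining is the easier of the two to picture. A toy sketch of the idea, not how any production map is actually written:

```java
import java.util.LinkedList;

// A toy chained hash map: each bucket holds a linked list of entries.
public class TinyMap<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    public TinyMap(int bucketCount) {
        buckets = new LinkedList[bucketCount];
        for (int i = 0; i < bucketCount; i++) buckets[i] = new LinkedList<>();
    }

    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    public void put(K key, V value) {
        // Walk the chain: update in place on a match, append otherwise.
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        buckets[indexFor(key)].add(new Entry<>(key, value));
    }

    public V get(K key) {
        // This loop is the worst case: a long chain means a long O(n) walk.
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        TinyMap<String, Integer> m = new TinyMap<>(4);
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(m.get("a")); // 1
        System.out.println(m.get("b")); // 2
    }
}
```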
The Hidden Cost: Resize
This is where performance silently dies.
Dictionaries start small, just a few buckets.
As items grow, they track a load factor (items ÷ buckets).
When it crosses roughly 70–75% (the exact threshold varies by language), the structure needs more room. It resizes:
1. Allocate a new array (roughly 2x bigger; some implementations, like .NET, jump to the next prime)
2. Loop through EVERY existing item
3. Recalculate each: hash % newBucketCount
4. Reinsert into new positions
One .Add() just touched every item in the collection.
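Those four steps look roughly like this. A standalone sketch with chained buckets that just hold keys; real implementations differ in detail:

```java
import java.util.LinkedList;
import java.util.List;

public class ResizeSketch {
    @SuppressWarnings("unchecked")
    static List<String>[] newBuckets(int count) {
        List<String>[] b = new List[count];
        for (int i = 0; i < count; i++) b[i] = new LinkedList<>();
        return b;
    }

    // The four steps from the article, for buckets that just hold keys.
    static List<String>[] resize(List<String>[] old) {
        List<String>[] bigger = newBuckets(old.length * 2);  // 1. allocate ~2x array
        for (List<String> bucket : old) {                    // 2. loop through EVERY item
            for (String key : bucket) {
                int idx = Math.floorMod(key.hashCode(), bigger.length); // 3. hash % newBucketCount
                bigger[idx].add(key);                        // 4. reinsert at new position
            }
        }
        return bigger;
    }

    public static void main(String[] args) {
        List<String>[] buckets = newBuckets(4);
        for (String key : new String[] { "alpha", "beta", "gamma" }) {
            buckets[Math.floorMod(key.hashCode(), buckets.length)].add(key);
        }
        buckets = resize(buckets); // every key is rehashed and moved
        System.out.println(buckets.length); // 8
    }
}
```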
50,000 items without pre-sizing:
Start: 3 buckets
Resize 1: 7 buckets → rehash all
Resize 2: 17 buckets → rehash all
Resize 3: 37 buckets → rehash all
...
Resize 13: 75,000+ buckets → rehash all
Could be 10-15 resizes depending on implementation. Each one rehashes everything. Tens of thousands of wasted operations.
With pre-sizing:
var dict = new Dictionary<string, Order>(capacity: 50000);
Zero resizes. 3x faster.
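The same pre-sizing idea in Java. One wrinkle worth knowing: HashMap grows when it passes 75% full, so sizing for the load factor is what actually eliminates every resize:

```java
import java.util.HashMap;
import java.util.Map;

public class PreSized {
    public static void main(String[] args) {
        int count = 50_000;

        // new HashMap<>(count) alone would still resize once near the end,
        // because the table grows at 75% occupancy (default load factor 0.75).
        // Sizing for the load factor avoids every resize:
        int capacity = (int) (count / 0.75f) + 1;
        Map<Integer, String> orders = new HashMap<>(capacity);

        for (int i = 0; i < count; i++) {
            orders.put(i, "order-" + i);
        }
        System.out.println(orders.size()); // 50000
    }
}
```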
The Real Complexity
| Scenario | Time |
| --- | --- |
| Average case | O(1) |
| Many collisions | O(n) |
| Resize triggered | O(n) |
| Bad hash function | O(n) |
Simple Takeaway
If you know the size, tell the dictionary.
C#: new Dictionary<K,V>(capacity: count)
Java: new HashMap<>(count) (or, to account for the 0.75 load factor and skip every resize: new HashMap<>((int) (count / 0.75f) + 1))
If you don’t know the cost, you’re already paying it.