When Azure SQL Becomes a Lookup Engine: How We Fixed a Silent Architecture Failure

Most Azure SQL performance problems are not solved by scaling SQL. They’re solved by removing the work SQL should never be doing.

At Khoj, we ran into this the hard way.

Our Azure SQL S12 workloads were “healthy” on paper, but in reality we were hitting a wall:

  • Storage was approaching the 1 TB limit
  • DTUs and CPU were spiking
  • Logic Apps were timing out
  • Every scale-up meant higher cost with diminishing returns

The problem wasn’t bad queries. It was a read-heavy enterprise workload using a transactional database as a lookup engine.

That’s where most cloud architectures quietly break.

So instead of asking “How do we make SQL faster?” we asked something more uncomfortable:

“Why is SQL doing this work at all?”

The architectural shift

We introduced Azure Cache for Redis as a read layer in front of Azure SQL.

Not as a generic cache but as a purpose-built lookup engine.

We identified exactly what was hurting us most:

  • Cross-reference (XREF) data
  • Reference and configuration lookups
  • Highly repetitive API calls

These were:

  • Predictable
  • Read-heavy
  • Rarely changing
  • And extremely expensive when run through SQL at scale

So we cached the full API response for every unique lookup key in Redis with a 30-day TTL, matching the real business change frequency.
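
As an illustration (a simplified Python sketch, not our production code), the idea is one deterministic Redis key per unique lookup, with the TTL expressed in seconds:

    import hashlib
    import json

    # 30-day TTL, matching how often the underlying reference data actually changes
    XREF_TTL_SECONDS = 30 * 24 * 60 * 60

    def cache_key(dataset: str, lookup_params: dict) -> str:
        # The same lookup parameters always produce the same key, so repeated
        # calls for the same entity hit the same cached API response.
        canonical = json.dumps(lookup_params, sort_keys=True)
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
        return f"xref:{dataset}:{digest}"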

Now the flow looks like this:

Client → API Management → Azure Function → Redis → (SQL only on cache miss)

SQL stays the system of record. Redis becomes the system of speed.
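
A minimal cache-aside sketch of that flow, assuming redis-py and pyodbc inside the Azure Function (connection details are placeholders, and the real implementation may differ):

    import json
    import pyodbc
    import redis

    REDIS = redis.Redis(host="<cache-name>.redis.cache.windows.net",
                        port=6380, ssl=True, password="<access-key>")
    SQL_CONNECTION_STRING = "<azure-sql-connection-string>"

    def lookup(key: str, fallback_query: str, params: tuple) -> dict:
        cached = REDIS.get(key)
        if cached is not None:
            # Hot path: Redis answers and SQL is never touched.
            return json.loads(cached)

        # Cache miss: fall back to Azure SQL, the system of record.
        with pyodbc.connect(SQL_CONNECTION_STRING) as conn:
            cursor = conn.cursor()
            cursor.execute(fallback_query, params)
            row = cursor.fetchone()
            columns = [col[0] for col in cursor.description]

        response = dict(zip(columns, row)) if row else {}

        # Store the full response (using the 30-day TTL above) so the next
        # call for this key never reaches SQL.
        REDIS.setex(key, XREF_TTL_SECONDS, json.dumps(response, default=str))
        return response

Each unique key pays the SQL cost at most once per TTL window; everything after that is served from memory.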

Why this actually worked

Caching alone isn’t enough: cache misses still hit SQL.

So we partitioned the underlying SQL data model, eliminating large scans and making fallback queries fast and predictable.
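
Continuing the sketch above (table and column names are invented for illustration), the fallback query always pins the partition key, so a miss becomes a targeted seek into a single partition rather than a table-wide scan:

    # Hypothetical XREF table partitioned on SourceSystem; every fallback
    # query filters on that column, so partition elimination kicks in.
    FALLBACK_QUERY = """
        SELECT TargetSystem, TargetId, LastUpdated
        FROM dbo.CrossReference
        WHERE SourceSystem = ? AND SourceId = ?
    """

    def lookup_xref(source_system: str, source_id: str) -> dict:
        key = cache_key("xref", {"system": source_system, "id": source_id})
        return lookup(key, FALLBACK_QUERY, (source_system, source_id))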

The result:

  • Redis absorbs the hot traffic
  • Partitioned SQL handles safe misses
  • Neither layer is overloaded

That’s what makes this design scale.

What changed

We didn’t get “slightly faster.” We changed the operating model of the system.

  • SQL CPU dropped sharply
  • API latency on user-facing paths improved
  • Logic Apps stopped timing out
  • Peak throughput increased
  • vCore requirements went down (real cost savings)

Redis didn’t make SQL faster. It made SQL less busy.

And that’s the difference between scaling up and scaling properly.

If your Azure SQL costs keep rising while performance stays fragile, ask yourself this:

What percentage of your workload is actually transactional, and how much of it is just expensive lookup traffic wearing a SQL disguise?

Comments

This is a great example of intentional architecture rather than reactive optimisation. Treating Redis as part of the read-path design (not just a performance patch) and pairing it with SQL partitioning shows real systems thinking. Cache misses are inevitable; designing for predictable misses is what separates scalable platforms from fragile ones. Well-articulated architectural shift by the Khoj Information Technology, Inc. team. 👏

Nice architecture. The key benefits are really impressive: data access from memory is exceptionally fast, and it reduces the load on the primary SQL server by serving most read requests from memory. Superb work by the team.

Nice example of how Khoj Information Technology, Inc. looks at performance problems holistically, not just through scaling. Practical and well thought out approach.

Redis isn’t just a cache here; it’s the accelerator that makes Azure SQL feel limitless. By caching hot queries and session data, it drastically reduces DTU load, cuts latency, and boosts throughput. 

Really smart to treat Redis as a foundational layer, not just a speed boost. Designing for predictable performance, even in the face of cache misses, demonstrates a deep understanding of real-world system behavior. Excellent work by the Khoj Team. 👍
