Thundering Herd Problem (When Everything Breaks at Once)

Built: A caching layer to reduce database load for frequently accessed data.

Problem I faced:
Everything worked well… until the cache expired. Suddenly:
• Huge spike in database queries
• CPU usage shot up
• API latency increased
• System became unstable
All at the same moment.

How I fixed it:
This was the Thundering Herd Problem. When a cache entry expired, many requests tried to fetch fresh data simultaneously.

Fixes applied:
• Added cache locking (single-flight) so only one request refreshes the data
• Introduced randomized cache expiry (TTL jitter) to avoid simultaneous expiration
• Used a stale-while-revalidate approach for smoother refresh

Now:
• Only one request hits the DB
• Others wait or get the cached response
• System stays stable

What I learned:
Caching reduces load… but poorly managed caching can create bigger spikes than no cache at all.

Question:
Have you ever seen your system fail not because of traffic… but because many requests did the same thing at the same time?

#Java #SpringBoot #Programming #SoftwareDevelopment #Cloud #AI #Coding #Learning #Tech #Technology #WebDevelopment #Microservices #API #Database #SpringFramework #Hibernate #MySQL #BackendDevelopment #CareerGrowth #ProfessionalDevelopment #RDBMS #PostgreSQL #backend
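The post does not show its code, so here is a minimal sketch of how single-flight loading and TTL jitter could fit together in plain Java. The class name `SingleFlightCache`, the `loader` function, and the `jitterFraction` parameter are all assumptions made for illustration, and the sketch assumes the loader never returns null. Stale-while-revalidate is left out to keep it short.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

/** Hypothetical cache: one loader call per key at a time, with jittered expiry. */
class SingleFlightCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtNanos;
        Entry(T value, long expiresAtNanos) {
            this.value = value;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // stands in for the database query
    private final Duration baseTtl;
    private final double jitterFraction;   // e.g. 0.2 adds up to 20% extra TTL

    SingleFlightCache(Function<K, V> loader, Duration baseTtl, double jitterFraction) {
        this.loader = loader;
        this.baseTtl = baseTtl;
        this.jitterFraction = jitterFraction;
    }

    /** Base TTL plus a random extra, so entries written together expire apart. */
    long jitteredTtlNanos() {
        long base = baseTtl.toNanos();
        long maxExtra = (long) (base * jitterFraction);
        long extra = maxExtra > 0 ? ThreadLocalRandom.current().nextLong(maxExtra + 1) : 0;
        return base + extra;
    }

    private V freshValueOrNull(K key) {
        Entry<V> entry = entries.get(key);
        boolean fresh = entry != null && System.nanoTime() - entry.expiresAtNanos < 0;
        return fresh ? entry.value : null;
    }

    V get(K key) {
        V cached = freshValueOrNull(key);
        if (cached != null) {
            return cached;
        }
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> leader = inFlight.putIfAbsent(key, mine);
        if (leader != null) {
            // Another request is already loading this key: wait for its result.
            return leader.join();
        }
        try {
            // Re-check: a previous leader may have finished just before we registered.
            V value = freshValueOrNull(key);
            if (value == null) {
                value = loader.apply(key);
                entries.put(key, new Entry<>(value, System.nanoTime() + jitteredTtlNanos()));
            }
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // waiters see the failure wrapped in CompletionException
            throw e;
        } finally {
            inFlight.remove(key, mine);
        }
    }
}
```

This version only coordinates requests inside one JVM. With several application instances, each instance could still send one query per expired key, which is where a shared lock or a cache that supports this natively would come in.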
Fixing the Thundering Herd Problem with Cache Locking and TTL Jitter
More Relevant Posts
-
Race Conditions in Backend Systems

Built: A simple order service where users can place orders and inventory gets updated.

Problem I faced:
Everything worked fine in testing. But in production, something weird started happening:
• The same product got sold more times than available
• Inventory went negative
• Duplicate updates started appearing
No errors. No exceptions. Just wrong data.

How I fixed it:
The issue was a race condition. Multiple requests were updating the same data at the same time. Here’s what helped:
• Added database-level locking for critical updates
• Used optimistic locking with version fields
• Introduced idempotency checks for repeated requests
• For high-contention cases, used Redis distributed locks
After that, updates became consistent again.

What I learned:
Concurrency issues don’t break loudly. They silently corrupt your data. And by the time you notice, it’s already too late.

Question:
Have you ever faced a bug where everything looked fine in logs… but the data was completely wrong?

#Java #SpringBoot #Programming #SoftwareDevelopment #Cloud #AI #Coding #Learning #Tech #Technology #WebDevelopment #Microservices #API #Database #SpringFramework #Hibernate #MySQL #BackendDevelopment #CareerGrowth #ProfessionalDevelopment #RDBMS #PostgreSQL #backend
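To make the version-field idea concrete, here is an in-memory sketch rather than the post's actual code. `VersionedStock`, `Row`, and `reserve` are hypothetical names, and an `AtomicReference` stands in for the database row: the compare-and-set plays the role of an `UPDATE ... WHERE version = ?` that changes zero rows when someone else got there first.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory stand-in for one inventory row with a version column. */
class VersionedStock {
    static final class Row {
        final int stock;
        final long version;
        Row(int stock, long version) {
            this.stock = stock;
            this.version = version;
        }
    }

    private final AtomicReference<Row> row;

    VersionedStock(int initialStock) {
        this.row = new AtomicReference<>(new Row(initialStock, 0));
    }

    Row read() {
        return row.get();
    }

    /**
     * Succeeds only if nobody changed the row since it was read, and bumps the
     * version on success. The compare-and-set on the row reference plays the
     * role of the "WHERE version = ?" check in SQL.
     */
    boolean updateIfUnchanged(Row expected, int newStock) {
        return row.compareAndSet(expected, new Row(newStock, expected.version + 1));
    }

    /** Reserves stock, re-reading and retrying when a concurrent update wins. */
    boolean reserve(int quantity, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Row current = read();
            if (current.stock < quantity) {
                return false; // not enough stock: reject instead of going negative
            }
            if (updateIfUnchanged(current, current.stock - quantity)) {
                return true;
            }
            // Version conflict: loop and try again with fresh data.
        }
        throw new IllegalStateException("gave up after " + maxRetries + " retries");
    }
}
```

The important part is that the stock check and the write are validated against the same version, so two requests can never both sell the last item.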
-
Sometimes everything in your system works fine. Then one day, traffic spikes… and multiple requests try to update the same data at the same time.

Now you get weird issues:
• Duplicate orders
• Overbooked seats
• Negative inventory
Not because of bugs in the business logic. Because of concurrent updates.

This is where Distributed Locking comes in
The idea is simple: only one process should modify a resource at a time. Everyone else has to wait.

What actually happens
Let’s say two requests try to update the same product stock.
Without locking:
• Both read stock = 10
• Both reduce it and write back 9
• Final value is wrong, because one update is lost
With locking:
• First request gets the lock
• Second request waits
• Updates happen safely

Where this is used
• Payment processing
• Inventory management
• Booking systems
• Scheduled jobs
Anywhere consistency matters.

Common ways to implement
• Database locks: simple, but can affect performance.
• Redis locks (like Redisson): fast and commonly used in distributed systems.
• Zookeeper / etcd: used in large-scale systems.

Why this matters
In distributed systems:
• Multiple instances run in parallel
• Race conditions are common
• Data can get corrupted silently
Locks help keep things consistent.

But be careful
Locks can slow things down. If not handled properly, they can even cause deadlocks. Use them only where necessary.

Simple takeaway
When multiple processes touch the same data, coordination becomes essential.

Where in your system could two requests clash at the same time without you noticing?

#Java #SpringBoot #Programming #SoftwareDevelopment #Cloud #AI #Coding #Learning #Tech #Technology #WebDevelopment #Microservices #API #Database #SpringFramework #Hibernate #MySQL #BackendDevelopment #CareerGrowth #ProfessionalDevelopment #RDBMS #PostgreSQL #backend
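The read-then-write sequence described above can be sketched as follows. This is an illustration only: `KeyedLocks` and `StockStore` are hypothetical names, and the lock here is an in-process `ReentrantLock` per key. Across several instances, a shared lock service (Redis, ZooKeeper, etcd) would have to take its place. The bounded wait is one simple way to avoid hanging forever if a lock is never released.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical per-key lock; an in-process stand-in for a distributed lock. */
class KeyedLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withLock(String key, long waitMillis, Supplier<T> criticalSection) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        boolean acquired;
        try {
            // Bounded wait: give up instead of blocking indefinitely.
            acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lock " + key, e);
        }
        if (!acquired) {
            throw new IllegalStateException("could not lock " + key + " in time");
        }
        try {
            return criticalSection.get();
        } finally {
            lock.unlock(); // always release, even if the critical section throws
        }
    }
}

/** Hypothetical stock store whose read-modify-write runs under the lock. */
class StockStore {
    private final ConcurrentHashMap<String, Integer> stock = new ConcurrentHashMap<>();
    private final KeyedLocks locks = new KeyedLocks();

    void set(String productId, int quantity) {
        stock.put(productId, quantity);
    }

    int get(String productId) {
        return stock.getOrDefault(productId, 0);
    }

    boolean decrement(String productId) {
        return locks.withLock("stock:" + productId, 5_000, () -> {
            int current = stock.getOrDefault(productId, 0); // read
            if (current <= 0) {
                return false;
            }
            stock.put(productId, current - 1);              // write
            return true;
        });
    }
}
```

Locking per key rather than globally keeps unrelated products from waiting on each other, which limits the slowdown the post warns about.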
-
Our CPU hit 99% and nothing looked wrong

We had rising CPU, slow APIs, and zero complex queries. Just plain UPDATEs.

What we missed
In PostgreSQL, UPDATE does not overwrite a row. It creates a new tuple and keeps the old one as a dead tuple until vacuum reclaims it. Every update = more data.

What happened in prod
• Same rows updated again and again
• Dead tuples kept increasing
• Tables silently bloated
• Queries got slower over time

The real issue
Not bad queries. Not bad indexes. Just misunderstood database behavior.

What fixed it
• Reduced unnecessary updates
• Shortened transactions
• Let vacuum catch up

Takeaway
If your system updates the same rows frequently, you are not just updating data. You are creating more of it. And that adds up fast.

#PostgreSQL #BackendEngineering #SystemDesign #DatabaseInternals #PerformanceOptimization #Scalability #SoftwareEngineering #MVCC #TechLearning #Java #SpringBoot
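One way to watch for this kind of bloat is PostgreSQL's `pg_stat_user_tables` statistics view, which reports estimated live and dead tuple counts per table. The sketch below only holds the query text and a small ratio helper. `BloatCheck` and the threshold parameter are hypothetical, and running the query over JDBC is left out.

```java
/** Hypothetical helper for spotting tables with many dead tuples. */
class BloatCheck {
    /** Reads estimated tuple counts from PostgreSQL's pg_stat_user_tables view. */
    static final String DEAD_TUPLE_QUERY =
            "SELECT relname, n_live_tup, n_dead_tup, last_autovacuum "
          + "FROM pg_stat_user_tables "
          + "ORDER BY n_dead_tup DESC LIMIT 10";

    /** Share of a table's tuples that are dead, between 0.0 and 1.0. */
    static double deadTupleRatio(long liveTuples, long deadTuples) {
        long total = liveTuples + deadTuples;
        return total == 0 ? 0.0 : (double) deadTuples / total;
    }

    /** The threshold is an assumption; pick one that fits the workload. */
    static boolean needsAttention(long liveTuples, long deadTuples, double threshold) {
        return deadTupleRatio(liveTuples, deadTuples) > threshold;
    }
}
```

The counts in that view are estimates, so a high ratio is a hint to look closer at vacuum activity and long-running transactions, not a precise measurement.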
-
Timeouts (The Small Setting That Saves Your System)

Built: A service calling multiple downstream APIs to fetch and aggregate data.

Problem I faced:
Everything worked fine… until one dependency slowed down. Then suddenly:
• Requests started hanging
• Thread pool got exhausted
• API response time shot up
• Entire service became slow
All because one service was taking too long.

How I fixed it:
The issue was missing timeouts. Requests were waiting indefinitely.

Fixes applied:
• Added strict timeouts for all external calls
• Used fallback responses where possible
• Combined with a circuit breaker for failing services
• Monitored slow calls with proper logging

Now:
• Slow services don’t block everything
• System fails fast instead of hanging
• Overall stability improved

What I learned:
A slow dependency is sometimes worse than a failed one. At least failures are quick. Slow calls quietly kill your system.

Question:
Do your API calls have proper timeouts… or are they waiting forever without you noticing?

#Java #SpringBoot #Programming #SoftwareDevelopment #Cloud #AI #Coding #Learning #Tech #Technology #WebDevelopment #Microservices #API #Database #SpringFramework #Hibernate #MySQL #BackendDevelopment #CareerGrowth #ProfessionalDevelopment #RDBMS #PostgreSQL #backend
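As a rough illustration of "strict timeouts plus a fallback", here is a sketch using only the JDK's `java.net.http` client and `CompletableFuture`. The post does not say which HTTP client it used, so the `Timeouts` class, the 2 and 3 second limits, and the `callWithFallback` helper are all assumptions. A circuit breaker is not shown.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helpers showing where timeouts can be set. */
class Timeouts {
    /** Client that gives up if a connection cannot be established in time. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
    }

    /** Request that gives up if no response arrives in time. */
    static HttpRequest newRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(3))
                .GET()
                .build();
    }

    /**
     * Runs a call with an overall time limit and returns the fallback on
     * timeout or failure. Note: the slow call itself keeps running in the
     * background; only the caller stops waiting for it.
     */
    static <T> T callWithFallback(Supplier<T> call, Duration limit, T fallback) {
        return CompletableFuture.supplyAsync(call)
                .completeOnTimeout(fallback, limit.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(error -> fallback)
                .join();
    }
}
```

Setting the limit on the request itself is what actually frees the underlying resources. The wrapper only protects the caller, so in practice the two are best used together.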
-
Your API works fast locally… But becomes slow in production. Why does this happen?

👉 I’ve seen this multiple times in real systems.

❌ Common reasons:
1. N+1 Queries → One request triggers multiple DB calls
2. Blocking operations → Threads waiting unnecessarily
3. No caching → Repeated DB hits for the same data
4. Poor database design → Unoptimized queries & indexes

✅ What actually helps:
✔️ Use caching (Redis)
✔️ Optimize queries & indexing
✔️ Use async processing where needed
✔️ Monitor performance (logs/metrics)

🧠 Reality:
Performance issues don’t appear in development… They show up under real traffic.

💬 Curious: What’s the biggest performance issue you’ve faced in production?

#Java #Backend #Performance #SystemDesign #Microservices #LearningInPublic
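The N+1 pattern from the list above is easiest to see by counting round trips. In this sketch `FakeDb` is a made-up in-memory repository that increments a counter on every call, so nothing here is a real database API. It compares one lookup per order against a single batched lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class OrderLoader {
    /** Hypothetical repository that counts how many queries it receives. */
    static class FakeDb {
        int queries = 0;
        private final Map<Integer, String> customers = Map.of(1, "Ana", 2, "Ben", 3, "Cy");

        List<Integer> findOrderCustomerIds() {
            queries++;
            return List.of(1, 2, 3, 1, 2);
        }

        String findCustomerName(int id) {
            queries++;
            return customers.get(id);
        }

        /** Batched lookup, comparable to a single "WHERE id IN (...)" query. */
        Map<Integer, String> findCustomerNames(Set<Integer> ids) {
            queries++;
            Map<Integer, String> found = new HashMap<>();
            for (int id : ids) {
                found.put(id, customers.get(id));
            }
            return found;
        }
    }

    /** N+1: one query for the orders, then one more per order. */
    static List<String> namesNPlusOne(FakeDb db) {
        List<String> names = new ArrayList<>();
        for (int id : db.findOrderCustomerIds()) {
            names.add(db.findCustomerName(id));
        }
        return names;
    }

    /** Batched: one query for the orders, one for all customers. */
    static List<String> namesBatched(FakeDb db) {
        List<Integer> ids = db.findOrderCustomerIds();
        Map<Integer, String> byId = db.findCustomerNames(new HashSet<>(ids));
        return ids.stream().map(byId::get).collect(Collectors.toList());
    }
}
```

With five orders the first version issues six queries and the second issues two. The gap grows with the number of rows, which is why this often goes unnoticed on small local datasets.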
-
🚀 Backend Learning | Caching vs Database: When to Use What?

While working on backend systems, I recently explored an important decision: when to use cache and when to rely on the database.

🔹 The Problem:
• Frequent DB calls increasing latency
• Need for faster responses under heavy traffic
• Balancing performance with data consistency

🔹 What I Learned:
• Cache (Redis): Best for frequently accessed, read-heavy data
• Database: Best for reliable, consistent data storage
• Cache improves speed, DB ensures correctness

🔹 Key Trade-offs:
• Cache → Fast but may serve stale data
• DB → Accurate but slower under load
• Choosing depends on use-case and consistency requirements

🔹 Outcome:
• Better performance optimization decisions
• Improved system design thinking
• Balanced speed vs consistency

Good backend design is not about choosing one. It’s about choosing the right tool at the right time. 🚀

#Java #SpringBoot #Redis #Database #SystemDesign #BackendDevelopment #LearningInPublic
-
I've been heads-down building for a while and haven't shared much here. Changing that.

Here's something I keep running into: teams reach for new infrastructure when the real problem is their queries.

At a previous company, our API response times were climbing. The conversation started drifting toward caching layers, read replicas, maybe a new service. Before any of that, I spent a day with EXPLAIN ANALYZE and pg_stat_statements.

What I found:
→ A few joins were scanning full tables because indexes didn't match the actual query patterns in production
→ One N+1 had been there so long everyone assumed "that endpoint is just slow"
→ A couple of queries were sorting in Ruby that PostgreSQL could have sorted faster itself

Three changes. No new infrastructure. API response times dropped by over 60%.

The lesson I keep relearning: Most performance problems aren't architecture problems. They're query problems. And query problems are cheap to fix if you measure before you redesign.

If your API feels slow, run EXPLAIN ANALYZE before you add a service. You might save yourself months.

Everyone's talking about AI-powered observability. Meanwhile, EXPLAIN ANALYZE is free and tells you exactly what's wrong.

#postgresql #backendengineering #softwareengineering
-
Stop hiding SQL. Start owning it.

After working with different approaches, one thing became clear. sqlc is the cleanest way to handle data in serious backend systems.

You write real SQL. What you write is what runs. No guessing, no surprises. Queries are checked at compile time, not in production.

There is no hidden behavior. No unexpected joins. No performance issues showing up later.

The structure stays clean. SQL, generated code, repository, service. Easy to follow, easy to maintain.

ORMs feel fast in the beginning. But as systems grow, they bring hidden complexity and make debugging harder. With sqlc, you stay in control from day one.

If you are building APIs, microservices, or anything that needs to scale, this approach just makes more sense.

#sqlc #golang #backend #softwareengineering #microservices #postgresql #cleanarchitecture #api #webdevelopment
-
Your app queries the same data 1000 times a day. The data barely changes. You're hammering your database for no reason.

Fix: Spring Cache - add it in minutes.

Step 1: Enable caching

    @SpringBootApplication
    @EnableCaching
    public class MyApp { ... }

Step 2: Cache expensive method results

    @Service
    public class ProductService {

        @Cacheable("products")
        public List<Product> getAllProducts() {
            // this DB call only runs on the FIRST request
            // subsequent calls return from cache instantly
            return productRepo.findAll();
        }

        @CacheEvict(value = "products", allEntries = true)
        public void addProduct(Product p) {
            // clears cache when data changes
            productRepo.save(p);
        }
    }

Key annotations:
@Cacheable → cache the result
@CacheEvict → clear the cache
@CachePut → update cache without skipping the method

By default uses in-memory cache. Swap to Redis for distributed caching with one config change.

Same annotation. Massive performance gain.

#Java #SpringBoot #Caching #BackendDevelopment #LearningInPublic #Performance