Idempotency in Distributed Systems: A Common Bug and Easy Fix

One of the most common distributed-systems bugs I've seen across different teams is the duplicate-charge or duplicate-action problem. It usually comes down to a missing idempotency key.

The pattern: a client retries a request after a timeout. The original request had actually succeeded - the response just never made it back. Result: two charges, one unhappy user, one urgent on-call ticket.

Idempotency in REST APIs is one of those topics that sounds straightforward until it surfaces in production. The mental model I use:

GET, PUT, DELETE → idempotent by HTTP spec. Sending the same request twice leaves the server in the same state as sending it once. The response can still differ (a second DELETE may return 404), but a retry does no extra damage.

POST → not idempotent by default (the same goes for PATCH). This is where most idempotency bugs come from.

The fix isn't complicated. For any POST that creates or modifies state, accept an Idempotency-Key header from the client. Store the key and the response in a short-lived cache (Redis works well, with a TTL of a few hours to a day). On a retry with the same key, return the cached response instead of re-processing.

Three things that often get missed:

1. The key has to be generated by the client, not the server. The server can't tell a retry from a brand-new request, so a key it mints per request would be different on every attempt, which defeats the purpose.
2. Cache the response, not just a "this key was used" flag. Otherwise the retry gets a different response shape than the original.
3. Scope keys per endpoint or per user. In a global key space, two unrelated clients that happen to send the same key collide, and one of them gets the other's cached response.

Idempotency is one of the lowest-cost, highest-value patterns you can build into a distributed system. Easy to skip during initial design. Painful to retrofit after an incident.

What's your team's pattern for this - header-based, request-hash-based, or something else?

#Java #SpringBoot #Microservices #DistributedSystems #BackendEngineering #SoftwareEngineering #APIDesign #TechLeadership
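For anyone who wants to see the header-based pattern from this post in code, here is a minimal Java sketch. Everything in it is an assumption for illustration: the class name IdempotencyStore, the execute method, and the "user:endpoint:key" scoping format are hypothetical, and an in-memory ConcurrentHashMap stands in for Redis so the example runs on its own. Treat it as a starting point under those assumptions, not a production implementation.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch: an in-memory map stands in for Redis with a TTL. */
final class IdempotencyStore {

    /** A cached response together with the moment it expires. */
    private static final class Entry {
        final String response;
        final Instant expiresAt;

        Entry(String response, Instant expiresAt) {
            this.response = response;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Duration ttl;

    IdempotencyStore(Duration ttl) {
        this.ttl = ttl;
    }

    /**
     * Runs the action at most once per (user, endpoint, client key) while the
     * cached entry is live. Retries with the same key get the stored response.
     * Assumes userId and endpoint do not contain ':' (illustrative key format).
     */
    String execute(String userId, String endpoint, String idempotencyKey,
                   Supplier<String> action) {
        // Scope the client-supplied key so unrelated callers cannot collide.
        String scopedKey = userId + ":" + endpoint + ":" + idempotencyKey;
        Instant now = Instant.now();

        // compute() is atomic per key, so two concurrent retries with the same
        // key cannot both run the action. If the action throws, nothing is
        // cached and a later retry processes the request again.
        Entry entry = cache.compute(scopedKey, (key, existing) -> {
            if (existing != null && existing.expiresAt.isAfter(now)) {
                return existing; // retry: reuse the original response
            }
            // First attempt (or expired entry): do the work and cache the result.
            return new Entry(action.get(), now.plus(ttl));
        });
        return entry.response;
    }
}
```

A controller would read the Idempotency-Key header and call something like store.execute(userId, "/charges", key, () -> chargeCard(request)), where chargeCard is a placeholder for the real work. Note that this sketch runs the action inside compute(), which is fine for a demo but holds a map lock for the duration of the call; with a real Redis-backed store you would more likely claim the key first with an atomic set-if-absent plus expiry, and expired entries would be evicted for you rather than lingering in the map.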
