Your API is slow because it's doing too much before it responds.

A user places an order. Your endpoint saves it, charges payment, sends an email, generates an invoice, updates inventory. Then it responds.

That payment call? 5 to 25 seconds. Thousands of requests during a flash sale? Thousands of blocked threads. Provider goes down? Your entire API goes down.

But the user only needs one answer: "Did you get my order?" That's it. Everything else can happen after.

The fix is one architectural shift:

→ API saves the order to the database
→ Queues the heavy work for a background worker
→ Returns "received" in ~50ms

The worker picks it up and handles the rest:
- Charge payment
- Send email
- Generate invoice
- Update inventory

If something fails, it retries with exponential backoff. If all retries fail, the user gets notified AND the engineering team gets an alert with the full traceback. Nobody is left in the dark.

Three things I learned building this in production:

1. Save to the database before queuing. If the worker crashes, the order still exists. The DB is your safety net.

2. Use Celery's on_failure() hook. Define it once in a custom base class. When retries run out, it automatically notifies users and alerts your team. No scattered try/except blocks.

3. Your API is a receptionist, not a worker. It takes the request, confirms receipt, and hands it off. The real work happens in the background.

What's the slowest thing your API does before responding?

↓ Full blog post with architecture diagram and code in the comments

#Python #SoftwareEngineering #SystemDesign #BackendDevelopment #Celery
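A minimal sketch of that flow, assuming Celery with a Redis broker. The notification and heavy-work helpers are hypothetical stand-ins, not the blog post's actual code:

from celery import Celery, Task

app = Celery("orders", broker="redis://localhost:6379/0")

# Hypothetical helpers -- swap in your real notification and alerting code.
def notify_user(order_id, message):
    print(f"user notified for order {order_id}: {message}")

def alert_team(task_id, traceback):
    print(f"PAGE THE TEAM -- task {task_id}:\n{traceback}")

class NotifyOnFailure(Task):
    """Custom base class: on_failure fires once, after all retries are exhausted."""
    def on_failure(self, exc, task_id, args, kwargs, einfo):
        notify_user(kwargs.get("order_id"), "We hit a problem processing your order.")
        alert_team(task_id, traceback=str(einfo))

@app.task(
    base=NotifyOnFailure,
    bind=True,
    autoretry_for=(Exception,),  # retry on any exception...
    retry_backoff=True,          # ...with exponential backoff
    retry_jitter=True,
    max_retries=5,
)
def process_order(self, order_id):
    # The heavy work the API no longer waits on (stubbed here):
    # charge payment, send email, generate invoice, update inventory.
    print(f"processing order {order_id} in the background")

The endpoint itself then only persists the order, calls process_order.delay(order_id=order.id), and returns a 202 "received" in milliseconds.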
Optimize API Performance with Background Workers
More Relevant Posts
I've 𝗱𝗲𝗯𝘂𝗴𝗴𝗲𝗱 all 5 of these in production. Every single one looked fine in dev.

𝟭. 𝗠𝗶𝘀𝘀𝗶𝗻𝗴 𝗶𝗻𝗱𝗲𝘅 𝗼𝗻 𝘁𝗵𝗲 𝗙𝗞 𝗰𝗼𝗹𝘂𝗺𝗻
Your JOIN was fine with 10 rows. It wasn't fine with 10 million. One `CREATE INDEX`. Same query. Same data. 5,000x faster.

𝟮. 𝗦𝗘𝗥𝗜𝗔𝗟𝗜𝗭𝗔𝗕𝗟𝗘 𝗶𝘀𝗼𝗹𝗮𝘁𝗶𝗼𝗻 𝗼𝗻 𝗲𝘃𝗲𝗿𝘆 𝘁𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻
"For safety." It triggered 3x latency spikes at 1,000 concurrent writers. READ COMMITTED handles 95% of real production workloads.

𝟯. 𝗢𝗥𝗠 𝗹𝗮𝘇𝘆-𝗹𝗼𝗮𝗱𝗶𝗻𝗴 𝗶𝗻 𝗮 𝗹𝗼𝗼𝗽
1 API call. 847 database queries. Your ORM logged none of it. 5 users: fast. 500 users: 8-second timeout.

𝟰. 𝗨𝗨𝗜𝗗 𝘃𝟰 𝗮𝘀 𝘁𝗵𝗲 𝗽𝗿𝗶𝗺𝗮𝗿𝘆 𝗸𝗲𝘆
Random inserts fragment the B-tree. 40-60% slower writes at 10M rows. UUID v7 is sequential. Same format. None of the cost.

𝟱. 𝗢𝗙𝗙𝗦𝗘𝗧 𝗽𝗮𝗴𝗶𝗻𝗮𝘁𝗶𝗼𝗻 𝗽𝗮𝘀𝘁 𝟭𝟬𝟬𝗞 𝗿𝗼𝘄𝘀
`OFFSET 500000` scans half a million rows and throws every one away. p99: 8 seconds. Cursor pagination: 1ms. Same database.

The pattern is always the same: works in dev, breaks in production, costs a weekend to find.

All 5 breakdowns are in the image. One topic. One mistake. One fix per card.

Save this before you deploy your next feature. Which one have you already shipped to production?

(9-13)/40 - All About Backend Engineering

Save 📌 to refer to it later. Repost ♻️ to help an engineer.

Follow @Kuldeep Kumawat to learn about scaling

#BackendEngineering #Database #SystemDesign
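Two of those fixes condense to a few lines of SQL. A sketch against a hypothetical orders table (names are illustrative, not from the post):

# Mistake 1: the missing FK index. One statement, same query, same data.
CREATE_FK_INDEX = "CREATE INDEX idx_orders_customer_id ON orders (customer_id);"

# Mistake 5: OFFSET reads every skipped row before discarding it...
OFFSET_PAGE = """
SELECT id, created_at FROM orders
ORDER BY id
LIMIT 50 OFFSET 500000;   -- scans 500,050 rows to return 50
"""

# ...while keyset (cursor) pagination seeks straight to the last-seen key.
KEYSET_PAGE = """
SELECT id, created_at FROM orders
WHERE id > :last_seen_id  -- cursor carried over from the previous page
ORDER BY id
LIMIT 50;                 -- touches only the 50 rows it returns
"""

Same database, same data: the index and the seek predicate do all the work.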
Ever wonder why your Claude Code session suddenly burned 50k tokens in one turn? 🐱

If you use Claude Code a lot, I'm sure you've hit this wall too: your session suddenly gets expensive, context fills up unexpectedly, and you have no idea why. Was it that Bash command that searched your entire repo? The Read that loaded a 3,000-line config file? You're left guessing.

I spent the past week building CAT (Context Analyzer Terminal) to solve exactly that.

What it does:
→ Hooks silently into Claude Code sessions
→ Tracks token cost per individual tool call — Read, Bash, Grep, etc.
→ Builds rolling baselines using Welford's algorithm
→ Fires a real-time alert the moment something exceeds your normal baseline (Z-score detection)
→ Gives you a plain-English explanation of why something was expensive
→ Shows burn rate projection, cache efficiency, and overhead ratio
→ Live Rich TUI dashboard — runs entirely locally

The non-obvious engineering problem: Claude Code hooks fire tool events and token snapshots as two separate streams — neither includes the other's data. The core of CAT is a delta engine that correlates them by session ID and timestamps to compute per-call cost attribution.

Setup is 3 commands. MIT licensed. 113 tests. CI passing on macOS, Ubuntu, and Windows across Python 3.11–3.13.

🔗 GitHub: https://lnkd.in/dV69pHvs

I'm actively looking for contributors — there are curated good-first-issues ranging from one-liners to full features. If you're into Python, async systems, or developer tooling, take a look.

What token visibility features would make Claude Code more useful for you? Drop a comment — building this in public and all feedback shapes the roadmap.

No more buying a cat in a sack! 🐱 I built an open-source tool that spares you the guesswork and shows exactly which tool call is eating your context window in Claude Code.

Anyone who uses Claude Code knows this moment: the session suddenly gets expensive, the context fills up without warning, and you have no idea why. Was it the Bash command that accidentally scanned the entire repo? Or a huge config file loaded via Read?

Over the past week I built CAT (Context Analyzer Terminal) to solve exactly that.

What does it give you?
→ Silent monitoring of Claude Code sessions.
→ Token-cost tracking for each operation separately (Read, Bash, Grep, etc.).
→ Real-time anomaly detection (Z-score) based on Welford's algorithm.
→ Clear explanations of why a given operation was expensive.
→ Burn rate projection, cache efficiency, and overhead ratio.
→ A local terminal dashboard (Rich TUI).

The engineering challenge here was linking two separate streams of data (tool events and token snapshots) that Claude emits with no connecting identifier. CAT's engine performs correlation based on timestamps and session ID to attribute a precise cost to every call.

Setup is simple (3 commands), the code is MIT licensed, and more than 100 tests already pass in CI.

I'm looking for contributors to the project! There are plenty of open good-first-issues. If you're into Python, async systems, or dev tools, come take a look on GitHub 🔗 GitHub: https://lnkd.in/dV69pHvs

#OpenSource #Python #DeveloperTools #ClaudeCode #AI #BuildInPublic
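For intuition, here is a minimal sketch of the rolling-baseline idea (Welford's online mean/variance plus a Z-score check). It illustrates the technique only; it is not CAT's actual implementation:

import math

class RollingBaseline:
    """Welford's online algorithm: running mean/variance in O(1) per sample."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def zscore(self, x: float) -> float:
        if self.n < 2:
            return 0.0  # not enough history to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        return (x - self.mean) / std if std else 0.0

baseline = RollingBaseline()
for cost in (410, 380, 450, 395, 430):  # token costs of past Read calls
    baseline.update(cost)

if baseline.zscore(9_000) > 3.0:  # this call blew far past the normal baseline
    print("ALERT: anomalous token spend on this tool call")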
Your logs are lying to you.

Not because logging is useless… But because you're logging the wrong things.

---

👉 Most backend devs think logging = "console.log()"

That's not logging. That's noise.

---

What beginners do:

console.log("User logged in");
console.log("Error occurred");

Looks fine. But in production?
❌ Useless for debugging
❌ No context
❌ No traceability

---

Real problem: when something breaks in production, you don't know:
- Which user?
- Which request?
- What triggered it?
- What happened before it?

So you panic. And start guessing.

---

What strong backend engineers log:
✔ Request ID (trace every request)
✔ User ID (if available)
✔ Route + method
✔ Status code
✔ Error stack (not just message)
✔ Timestamp

---

Example (real logging):

logger.info({
  requestId: "abc123",
  userId: "user_42",
  method: "POST",
  route: "/api/orders",
  status: 500,
  error: err.stack,
  timestamp: new Date()
});

⚠️ Never log sensitive data (passwords, tokens, PII). Logs are often stored and shared — treat them as public.

---

This changes everything. Now you can:
✔ Trace a request end-to-end
✔ Debug production issues fast
✔ Understand real user behavior

---

But here's what most still ignore: logs without structure = garbage.

Level up your logging:
✔ Use structured logs (JSON)
✔ Use tools (Winston / Pino)
✔ Centralize logs (ELK / cloud logging)
✔ Add log levels (info, warn, error)

---

Brutal truth: if you can't debug your system in production… 👉 You don't understand your system.

Takeaway: logging isn't printing. 👉 It's observability.

---

Tomorrow: I'll break down why your database queries are slow (and it's not your DB's fault).

#BackendDevelopment #NodeJS #SystemDesign #Debugging #SoftwareEngineering
🚀 𝗜𝘀 𝘁𝗵𝗲 "𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 𝗔𝗴𝗲" 𝗼𝗳 𝗝𝗮𝘃𝗮 𝗲𝗻𝗱𝗶𝗻𝗴?

For years, we've been told to hide our data behind layers of "magic" ORMs and complex abstractions. We traded control for convenience, but in high-integrity industries, that convenience often comes with a hidden tax: unpredictable state and opaque execution.

Lately, I've been exploring a different path: 𝗗𝗮𝘁𝗮-𝗢𝗿𝗶𝗲𝗻𝘁𝗲𝗱 𝗦𝗼𝘃𝗲𝗿𝗲𝗶𝗴𝗻𝘁𝘆. Instead of fighting framework proxy logic or complex lifecycle management, what happens when you treat SQL as a first-class citizen and generic data structures as the ultimate source of truth?

The results are striking:
✅ Zero-Dependency Architecture.
✅ Total control over the physical metal (SQL).
✅ Immutable state transitions that are actually auditable.

I'm often asked: "𝘉𝘶𝘵 𝘸𝘪𝘵𝘩 𝘗𝘳𝘰𝘫𝘦𝘤𝘵 𝘓𝘰𝘰𝘮 𝘢𝘯𝘥 𝘝𝘪𝘳𝘵𝘶𝘢𝘭 𝘛𝘩𝘳𝘦𝘢𝘥𝘴, 𝘸𝘩𝘺 𝘣𝘰𝘵𝘩𝘦𝘳 𝘸𝘪𝘵𝘩 𝘙𝘦𝘢𝘤𝘵𝘪𝘷𝘦 𝘱𝘳𝘰𝘨𝘳𝘢𝘮𝘮𝘪𝘯𝘨 𝘢𝘯𝘺𝘮𝘰𝘳𝘦?"

The answer isn't about thread-blocking. It's about 𝗙𝗹𝗼𝘄 𝗜𝗻𝘁𝗲𝗴𝗿𝗶𝘁𝘆. Virtual threads handle concurrency, but Reactive (Mutiny) handles 𝗟𝗼𝗴𝗶𝗰. It's the difference between a "Precision Hammer" and a "High-Velocity Turbine." It's about building systems that don't just "run," but "react" — handling backpressure, stream composition, and circuit-breaking as fundamental laws of the engine, not as afterthoughts.

We are moving away from "Disposable Grade" software. The future belongs to "Industrial Grade" systems where the architect owns the perimeter, not the framework.

Who else is stripping back the abstractions to get closer to the metal? ⚔️

#Java #SoftwareArchitecture #ReactiveProgramming #DataOriented #BackendDevelopment #CleanCode
I've been using Claude Code to write R scripts for a while now. It works. But it kept suggesting outdated patterns like this:

👎🏽 # Old
data %>% filter(x > 0)
data %>% group_by(x) %>% summarise(y) %>% ungroup()
left_join(x, y, by = c("a" = "b"))
sapply(x, f)

👍🏽 # What it should write
data |> filter(x > 0)
summarise(data, mean(y), .by = x)
inner_join(x, y, by = join_by(a == b))
map_dbl(x, f)

🥲 Outdated syntax. Silent errors I'd only catch way downstream.

Last week I came across something that changed all of that.

👾 Sarah Johnson created the Modern R Development Guide to help Claude Code behave like a modern R user. Alistair Bailey built on Sarah's work (and others') to create Claude Code R Skills.

These skills are brilliant. I can't recommend them enough.
→ tidyverse-patterns
→ r-style-guide
→ r-performance
→ rlang-patterns

𝗠𝘆 𝗳𝗮𝘃𝗼𝗿𝗶𝘁𝗲 𝘀𝗸𝗶𝗹𝗹 𝘀𝗼 𝗳𝗮𝗿: tidyverse-patterns

It doesn't just know tidyverse. It knows the patterns you actually should be using! Functions like filter_out that I didn't even know existed. Claude Code started using them in my scripts before I had even thought to look for them. That alone was worth it.

I now regularly run these prompts in Claude Code to improve my R scripts:

"Read [script.R] and rewrite it applying the tidyverse-patterns and r-style-guide skills"

"Read [script.R] and apply r-performance optimizations without changing the logic"

💸 And because we all know how fast Claude tokens go... the repo includes settings for token optimization too. 🙌🏽
→ Switching the default model from Opus to Sonnet: 60% cost reduction, handles 80%+ of tasks just as well
→ Capping thinking tokens from 32k to 10k: 70% reduction in hidden cost, zero noticeable quality loss

If you use R and Claude Code together, this is worth your time!

Credit: Alistair Bailey / ab604 · Sarah Johnson / sj-io
📜 Logs don't become useful at scale. They become noise.

When your system is small, logs feel powerful. At scale? They overwhelm you.

---

🔍 The logging illusion

Early stage:
✔️ Few services
✔️ Low traffic
✔️ Easy debugging
Logs work well.

At scale:
❌ Millions of log lines per minute
❌ Hard to correlate across services
❌ Signal buried in noise
❌ Expensive storage
❌ Slow search during incidents

More logs ≠ more visibility.

---

💥 Real production scenario

Incident occurs. Team opens the log dashboard. Sees:
- Thousands of errors
- Millions of info logs
- Repeated stack traces

No clear root cause. Meanwhile:
- Latency rising
- Users impacted
- Time wasted searching

Logs existed. Insight didn't.

---

🧠 How senior engineers handle logs

They design logging intentionally.
✔️ Structured logs (JSON, correlation IDs)
✔️ Log levels used correctly
✔️ Sample high-volume logs
✔️ Correlate with metrics & traces
✔️ Focus on actionable events

They don't log everything. They log what matters.

---

🔑 Core lesson

Logs are raw data. Observability is understanding. If your logs don't guide you to answers, they're just expensive text. At scale, clarity beats volume.

---

Subscribe to Satyverse for practical backend engineering 🚀
👉 https://lnkd.in/dizF7mmh

If you want to learn backend development through real-world project implementations, follow me or DM me — I'll personally guide you. 🚀
📘 https://satyamparmar.blog
🎯 https://lnkd.in/dgza_NMQ

#BackendEngineering #Observability #SystemDesign #DistributedSystems #Microservices #Java #Scalability #Logging #Satyverse
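As an illustration of those practices, here is a minimal stdlib-only Python sketch: structured JSON lines, a correlation ID on every record, and sampling for one high-volume event class. Names and the sample rate are illustrative, not from the post:

import json
import logging
import random
import time
import uuid

class JsonFormatter(logging.Formatter):
    """Emit one structured JSON object per log line."""
    def format(self, record):
        return json.dumps({
            "ts": time.time(),
            "level": record.levelname,
            "msg": record.getMessage(),
            # correlation_id arrives via logger.info(..., extra={...})
            "correlation_id": getattr(record, "correlation_id", None),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("checkout")
log.addHandler(handler)
log.setLevel(logging.INFO)

SAMPLE_RATE = 0.01  # keep 1% of a routine, high-volume event class

def handle_request():
    correlation_id = str(uuid.uuid4())  # one ID threaded through the whole request
    if random.random() < SAMPLE_RATE:   # sampled: routine event, huge volume
        log.info("cache hit", extra={"correlation_id": correlation_id})
    # never sampled: actionable events always get through
    log.error("payment declined", extra={"correlation_id": correlation_id})

handle_request()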
Calling a distance API for every vendor is a design mistake, not a scaling problem. ⚡

I was building a "nearest vendor" feature. The naive approach was simple:

for vendor in vendors:
    distance = get_distance(user_location, vendor.location)

It worked… until vendors grew. More vendors = more API calls = higher cost and latency. The issue wasn't the API. It was how we were using it.

Before calling the API, I added a simple pre-filter using a bounding box:

def get_nearby_vendors(user_lat, user_lng, radius):
    return Vendor.objects.filter(
        lat__range=(user_lat - radius, user_lat + radius),
        lng__range=(user_lng - radius, user_lng + radius)
    )

Now only a small subset goes into the expensive API.

What changed:
• API calls dropped significantly
• Response time improved
• Cost reduced
• Output stayed practically accurate

Tradeoff:
• Slight risk of missing edge cases
• Requires tuning of the radius

Insight: don't use expensive services for what your database can handle.

Rule: Filter first → Compute later

Have you optimized API usage like this in your system?

#SoftwareEngineering #BackendDevelopment #SystemDesign #Django #Python #Performance #Scalability #APIDesign #Developers
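One way to push "filter first → compute later" further: rank the pre-filtered candidates with a local haversine distance, then send only the finalists to the paid API. A sketch building on the get_nearby_vendors above (radius in degrees, matching that query; get_distance is the post's expensive API call):

from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km -- cheap, local, no API call."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    dlat, dlng = lat2 - lat1, lng2 - lng1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlng / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))  # Earth radius ~6371 km

def nearest_vendors(user_lat, user_lng, radius_deg=0.1, top_n=3):
    candidates = get_nearby_vendors(user_lat, user_lng, radius_deg)  # cheap DB pre-filter
    ranked = sorted(
        candidates,
        key=lambda v: haversine_km(user_lat, user_lng, v.lat, v.lng),
    )
    # Only the handful of finalists hit the expensive API,
    # e.g. to get road distance instead of straight-line distance.
    return [(v, get_distance((user_lat, user_lng), (v.lat, v.lng))) for v in ranked[:top_n]]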
🚀 I'm Starting a New Series — And I Think You'll Find It Useful

Over the next days, I'll be posting one topic per day on Backend Communication Design Patterns. No fluff. No theory for the sake of theory. Just the patterns that actually show up in real systems — explained clearly, with examples you can relate to.

Let's begin with 🔁 Request / Response — The Pattern Running the Internet

Every time you open a website, run a SQL query, or call a REST API — you're using Request/Response. It's the most fundamental backend communication pattern. Understanding it deeply is what separates engineers who just use it from those who design with it.

─────────────────────────────
🔄 HOW IT WORKS
─────────────────────────────
1️⃣ Client sends a Request
2️⃣ Server parses the Request
3️⃣ Server processes the Request
4️⃣ Server sends a Response
5️⃣ Client parses the Response and consumes it

Simple. But the devil is in the details.

─────────────────────────────
📍 WHERE YOU'LL FIND IT
─────────────────────────────
→ HTTP, DNS, SSH
→ RPC (Remote Procedure Call)
→ SQL and database protocols
→ REST / SOAP / GraphQL APIs

─────────────────────────────
📐 THE ANATOMY MATTERS
─────────────────────────────
Every request has a boundary — a start and end defined by the protocol and message format. The server knows exactly where your headers end and data begins because of this contract.

In HTTP/1.1 it looks like this:

GET / HTTP/1.1
Host: example.com
Content-Type: application/json
<CRLF> ← this blank line is the boundary
{ body }

─────────────────────────────
📤 EVEN FILE UPLOADS FOLLOW THIS PATTERN
─────────────────────────────
→ Simple: send the whole image in one large request
→ Resumable: chunk the image, send one request per chunk. If the connection drops, only the last chunk needs retransmitting.

─────────────────────────────
⚠️ BUT IT DOESN'T WORK EVERYWHERE
─────────────────────────────
Request/Response breaks down when you need:
❌ Real-time notifications (chat apps, live feeds)
❌ Very long-running requests
❌ The server to talk first — without the client asking
❌ Handling client disconnections mid-processing

These gaps are exactly why the other patterns exist (we will discuss them in the coming posts isa).

─────────────────────────────
✅ PROS | ❌ CONS
─────────────────────────────
✅ Simple and elegant
✅ Scalable
✅ Works everywhere — HTTP, DNS, RPC, SQL
❌ Not suitable for real-time use cases
❌ Bad for multiple receivers
❌ High coupling between client and server
❌ Breaks when the client disconnects

#BackendEngineering #SystemDesign #SoftwareEngineering #LearningInPublic #Programming
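You can watch that boundary with nothing but a raw socket. A minimal Python sketch of one full request/response cycle (the target host is illustrative):

import socket

# A complete HTTP/1.1 request: the blank CRLF line is the header/body boundary.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"  # <CRLF> -- headers end here; a body (none for GET) would follow
)

with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(request.encode("ascii"))  # 1. client sends the Request
    response = b""
    while chunk := sock.recv(4096):        # 4-5. client reads the Response
        response += chunk

# The response honors the same contract: headers, blank line, then the body.
headers, _, body = response.partition(b"\r\n\r\n")
print(headers.decode())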
C# just got extension properties, and that might be the biggest syntax change since async/await.

𝗪𝗵𝗮𝘁 𝗵𝗮𝗽𝗽𝗲𝗻𝗲𝗱
C# 14 shipped with .NET 10, and the headline feature is extension members — a new syntax that goes beyond extension methods. Per the official docs, you can now declare extension properties, extension operators, and static extension members using a new extension block syntax.

𝗞𝗲𝘆 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀
• Extension members — declare extension properties and operators, not just methods. The new extension block syntax is source and binary compatible with existing extension methods.
• The field keyword — access a property's compiler-generated backing field directly in get/set accessors. Eliminates boilerplate private fields for simple validation logic.
• Null-conditional assignment — use ?. and ?[] on the left side of assignments. The right side evaluates only when the receiver isn't null.
• Implicit span conversions — first-class support for Span<T> and ReadOnlySpan<T> with implicit conversions from arrays, reducing ceremony in allocation-free APIs.
• Lambda parameter modifiers — add ref, in, out, scoped to lambda parameters without specifying types.

→ The .NET Blog describes these performance features as enabling "fewer temporary variables, fewer bounds checks, and more aggressive inlining."

𝗪𝗵𝘆 𝗶𝘁 𝗺𝗮𝘁𝘁𝗲𝗿𝘀
Extension members address one of the longest-standing feature requests in C# history. The ability to add properties and operators to types you don't own changes how libraries are designed — particularly for fluent APIs and LINQ-style patterns. The span and compound assignment changes are less visible but may have a larger runtime impact, since .NET 10's core libraries already use them internally for performance gains.

Link in comments.

#AI #AINews #CSharp #DotNet #DotNet10 #SoftwareEngineering
Most of us have requests baked into our muscle memory, but as web standards move toward HTTP/3 and high concurrency, the "old reliable" is starting to show its age.

I've been diving into Niquests, and it's a serious contender for the new standard. It's designed as a drop-in replacement, meaning you get a massive performance boost without the headache of a refactor.

What makes it a "pro" choice:
- Protocol Support: It handles HTTP/2 and HTTP/3 natively. If you're hitting modern APIs, this isn't just a "nice to have" — it's a massive efficiency gain.
- Multiplexing: You can send multiple requests over a single connection. This eliminates the handshake overhead that usually slows down bulk data fetching.
- True Async Compatibility: Unlike the original requests library, this is built to play nice with asyncio, making it ideal for high-traffic backend services.
- Performance: In standard benchmarks, it significantly outperforms HTTPX and AIOHTTP in request-heavy loops.

If you're building production-grade scrapers, microservices, or data pipelines, the switch is almost a no-brainer. It's the same API we love, just supercharged for 2026.

Check out the project on GitHub: https://lnkd.in/d98Zy_cc

#Python #SoftwareEngineering #Backend #Performance #DataEngineering #OpenSource
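A short sketch of both modes. The drop-in call mirrors the requests API exactly; the multiplexed-session keyword and gather() call follow the Niquests docs as I understand them, so verify them against the project README before relying on this:

import niquests

# Drop-in replacement: identical surface to requests.
resp = niquests.get("https://httpbin.org/get", timeout=5)
print(resp.status_code)

# Multiplexing: queue several requests over one connection, then resolve.
with niquests.Session(multiplexed=True) as session:
    responses = [session.get("https://httpbin.org/delay/1") for _ in range(5)]
    session.gather()  # resolves the lazy responses concurrently
    print([r.status_code for r in responses])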
Here's the full blog post with the architecture diagram, code examples, and failure handling patterns: https://medium.com/p/95b27b1e5f0b?postPublishedType=initial