A 40ms API became a 4ms API. Here's the only thing that changed.
We were making 3 separate DB queries to assemble a response.
Each was fast in isolation. Together, they were sequential — each waited for the previous.
The fix: run them concurrently.
In Python (asyncio), this went from:
result_a = await get_a()
result_b = await get_b()
result_c = await get_c()
To:
result_a, result_b, result_c = await asyncio.gather(get_a(), get_b(), get_c())
That's it. No caching, no infra change, no complex refactor.
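A minimal, self-contained sketch of the pattern (get_a/get_b/get_c stand in for the real queries):

import asyncio

async def get_a():
    await asyncio.sleep(0.01)   # stand-in for a ~10ms DB query
    return "a"

async def get_b():
    await asyncio.sleep(0.01)
    return "b"

async def get_c():
    await asyncio.sleep(0.01)
    return "c"

async def handler():
    # the three coroutines run concurrently; total latency is roughly
    # the slowest query, not the sum of all three
    return await asyncio.gather(get_a(), get_b(), get_c())

print(asyncio.run(handler()))   # ['a', 'b', 'c']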
The mental model that helps: always ask "are these operations actually dependent on each other?" before assuming they need to run in sequence.
Most API latency problems aren't hard — they're just unexamined.
#BackendDevelopment #PythonAsyncio #APIOptimization #SoftwareEngineering
called the same API endpoint 5 times in a row.
without cache: 2.51s
with lru_cache: 0.50s
5x faster. two lines of code.
@functools.lru_cache(maxsize=128)
def fetch_user(user_id):
...
the cache info tells the real story:
hits=4, misses=1
first call hits the actual API.
next 4? served instantly from memory.
this is how production systems handle repeated expensive calls — user profiles, config lookups, ML model loads, anything that doesn’t change every second.
lru_cache ships with Python. no libraries. just import functools.
two lines between slow and fast.
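a minimal, runnable sketch of the pattern (the 0.5s sleep stands in for the real API call):

import functools
import time

@functools.lru_cache(maxsize=128)
def fetch_user(user_id):
    time.sleep(0.5)                           # pretend this is the slow network call
    return {"id": user_id, "name": "demo"}

start = time.perf_counter()
for _ in range(5):
    fetch_user(42)                            # same argument -> cache hit after the first call
print(round(time.perf_counter() - start, 2)) # ~0.5 instead of ~2.5
print(fetch_user.cache_info())               # CacheInfo(hits=4, misses=1, maxsize=128, currsize=1)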
#Python #Backend #DataEngineering #Performance
0.0158 vs 0.0005 for the cached version. So, searching Bing: "does python lru cache return previous objects"
"Yes — Python’s built‑in functools.lru_cache returns the exact same object instance that was previously computed and cached, not a copy"
So the overhead is in recreating the object on each uncached call, since Python object creation is known to be slow. If raw performance matters, there are faster options, such as writing the API in C++ with Pistache or Crow. A more informative test would time 4 million unique users each requesting their user info 3 times.
Given that the returned data is a user object where the score changes and the username stays constant, the code needs refactoring, because it muddles two use cases. The username only needs to be sent the first time, and again only when it is updated. The score is better sent over a socket or WebSocket if it changes in real time and requires server input to be calculated, or not sent at all if it can be calculated client-side. If it needs to be broadcast to other client peers and their responses must be sent back to the other peers, a message queue is needed; if the peers' responses do not matter, the main server can handle the broadcasting.
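As a rough sketch of that split (FastAPI with a hypothetical score stream; the cached lookup would remain only for the rarely-changing username):

import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/score/{user_id}")
async def score_stream(websocket: WebSocket, user_id: int):
    # push score changes as they happen instead of re-sending the whole user object
    await websocket.accept()
    score = 0
    while True:
        score += 1                                    # stand-in for a server-side score update
        await websocket.send_json({"user_id": user_id, "score": score})
        await asyncio.sleep(1)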
Database queries whose results cannot be returned by directly querying the database are not good candidates for caching, and caching is not useful when the results change infrequently or are only needed once or a few times at most. With fewer than 4 million users, giving each user their own database on a single server can be easier than writing APIs if the data is just database table views (and if the service is paid, which reduces the risk of hacking by users; database caching can also be shared across multiple client applications).
We think document extraction should be simple.
Less than 10 lines of Python to extract structured data from any document.
Define your schema, send a file, get JSON back. Uncertain fields get flagged and you decide what to do with them.
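Purely as a hypothetical sketch of that pattern (made-up endpoint and response shape, Pydantic for the schema; the link below covers the actual schema format):

import json
import requests
from pydantic import BaseModel

class Invoice(BaseModel):        # the structure you want back
    vendor: str
    total: float
    due_date: str

resp = requests.post(
    "https://api.example.com/extract",                      # hypothetical endpoint
    files={"file": open("invoice.pdf", "rb")},
    data={"schema": json.dumps(Invoice.model_json_schema())},
)
invoice = Invoice(**resp.json()["fields"])                   # "fields" is an assumed response key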
Learn how to define schemas: https://lnkd.in/g7TH8VmD
FastAPI just unlocked a massive performance ceiling. 🚀
With the official release of FastAPI 0.136.0 supporting free-threaded Python (No-GIL), I couldn't just read the changelog—I had to benchmark it.
I ran a controlled, head-to-head comparison using identical code and identical hardware:
⚙️ Python 3.12 (GIL) vs. Python 3.13.0t (No-GIL)
The result? A ~8x improvement in CPU-bound throughput.
Same code. Same API. Zero changes.
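For context, a minimal sketch of the kind of CPU-bound endpoint where this shows up (not the actual benchmark code): plain def endpoints run in FastAPI's threadpool, and on a free-threaded build those threads can finally use separate cores.

from fastapi import FastAPI

app = FastAPI()

def burn_cpu(n: int) -> int:
    # pure-Python CPU work; with the GIL, only one thread makes progress at a time
    total = 0
    for i in range(n):
        total += i * i
    return total

@app.get("/compute")
def compute(n: int = 1_000_000):
    # a plain `def` endpoint runs in a threadpool; on Python 3.13t
    # concurrent requests can execute this loop on separate cores
    return {"result": burn_cpu(n)}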
This is a game-changer for anyone running:
🔹 ML Inference APIs (real-time model serving)
🔹 Data Processing & ETL Workloads
🔹 CPU-Intensive Backend Services
Is this the final nail in the coffin for the GIL bottleneck? Curious to hear what the Python backend community thinks.
#FastAPI #Python #NoGIL #PerformanceEngineering #BackendDevelopment #Concurrency #MachineLearning
QuillSort — A data sorter
Created by Isaiah Tucker
Most of the time, Python’s built-in sorted() and list.sort() are all you need.
But if you ever try to sort a lot of data—millions to billions of values, big numeric logs, or giant SQL exports—you quickly run into a wall: RAM, speed, or both.
So I built Quill-Sort (quill-sort on PyPI).
...
Link: https://lnkd.in/eHaFZyx4
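The post is cut off above, but the classic answer to sorting data that doesn't fit in RAM is an external merge sort: sort chunks, spill them to disk, then k-way merge. A stdlib-only sketch of that idea (not QuillSort's actual implementation), assuming newline-terminated text records:

import heapq
import tempfile

def _spill(sorted_lines):
    # write one sorted chunk to a temp file and return its path
    f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".chunk")
    f.writelines(sorted_lines)
    f.close()
    return f.name

def external_sort(lines, chunk_size=1_000_000):
    chunk_paths = []
    buf = []
    for line in lines:
        buf.append(line)
        if len(buf) >= chunk_size:
            chunk_paths.append(_spill(sorted(buf)))
            buf = []
    if buf:
        chunk_paths.append(_spill(sorted(buf)))
    files = [open(path) for path in chunk_paths]
    try:
        # heapq.merge lazily merges the already-sorted chunk files
        yield from heapq.merge(*files)
    finally:
        for f in files:
            f.close()

# usage: for line in external_sort(open("huge_export.txt")): ...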
𝗜𝗻𝘀𝘁𝗮𝗻𝗰𝗲 𝗠𝗲𝘁𝗵𝗼𝗱𝘀
A function inside a class is a method.
The __init__ method is special. It sets up your new object. It initializes your data.
New coders often forget one thing. They forget the self parameter.
You must put self as the first parameter of every instance method.
Python passes the object reference to it automatically. If you leave out self, the call fails with a TypeError.
Follow these rules for your methods:
- Put self as the first parameter.
- Use self to access object data.
- Name your initialization method __init__.
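Putting those rules together, a minimal example:

class User:
    def __init__(self, name):
        # __init__ sets up the new object; self is that object
        self.name = name

    def greet(self):
        # self gives the method access to the object's data
        return f"Hello, {self.name}!"

u = User("Ada")
print(u.greet())   # Python passes u as self automatically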
Source: https://lnkd.in/gMGDYKUz
Path vs. Query Parameters — Know the difference!
One of the most common questions when building APIs is: "Should this go in the URL path or as a query string?"
In FastAPI, the distinction is clean and easy to implement.
📍 Path Parameters:
Used to identify a specific resource.
Example: /users/{user_id}
Use these when the data is mandatory to find the object.
🔍 Query Parameters:
Used for filtering, sorting, or pagination.
Example: /users?active=true&sort=desc
Use these for optional parameters that modify the results.
FastAPI is smart enough to distinguish them just by how you define your function arguments. If the argument name appears in the path, it's a Path Param. If it doesn't (and it's a simple type), it's a Query Param. Simple as that! 🚀
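A minimal example of both (illustrative names):

from fastapi import FastAPI

app = FastAPI()

@app.get("/users/{user_id}")
def get_user(user_id: int, active: bool = True, sort: str = "desc"):
    # user_id is a path parameter because it appears in the route template;
    # active and sort become query parameters: /users/42?active=true&sort=desc
    return {"user_id": user_id, "active": active, "sort": sort}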
#Python #FastAPI #WebDevelopment #Backend #RESTAPI #CodingTips #30DaysOfFastAPI
Most FastAPI codebases look clean at first glance.
Until you try to change something.
I’ve noticed a pattern —
a lot of complexity doesn’t come from the problem itself, but from where the logic lives.
When routes start handling more than just request/response, things get harder to reason about.
Lately, I’ve been keeping one constraint:
Routes should stay thin.
They handle the HTTP layer. All business logic moves to services.
It’s a small shift, but it changes a lot:
1) Clearer separation of concerns
2) Easier testing
3) Fewer side effects when making changes
Also started appreciating dependency injection more. Not as a framework feature, but as a way to keep things decoupled and predictable.
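A minimal sketch of what that can look like (names and layout are illustrative, not a prescription):

from fastapi import APIRouter, Depends

# service layer (would live in services.py): plain Python, easy to unit test
class UserRepository:
    def save(self, payload: dict) -> dict:
        return {"id": 1, **payload}           # stand-in for real persistence

class UserService:
    def __init__(self, repo: UserRepository):
        self.repo = repo                      # injected dependency

    def create_user(self, payload: dict) -> dict:
        # validation, business rules, and side effects belong here, not in the route
        return self.repo.save(payload)

def get_user_service() -> UserService:
    return UserService(UserRepository())

# HTTP layer (would live in routes.py): thin, just request/response
router = APIRouter()

@router.post("/users")
def create_user(payload: dict, service: UserService = Depends(get_user_service)):
    user = service.create_user(payload)
    return {"id": user["id"]}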
Nothing groundbreaking here.
But at a time when a lot of code is being generated faster than it's being designed, maintainability comes down to how consistently we apply these basics — not whether we know them.
Curious how others approach structuring FastAPI projects at scale.
#FastAPI #BackendDevelopment #CleanCode #SoftwareEngineering #Python
Day 2 of my LeetCode journey 🚀
Today’s problem: Group Anagrams
This challenge was all about grouping strings that share the same characters. I approached it using a dictionary + hashing strategy in Python. For each word, I sorted its characters and used that as a key (converted into a tuple), ensuring all anagrams map to the same bucket.
Here’s the core logic I implemented:
▪️Traverse the list of strings
▪️Sort each string → convert to tuple → use as dictionary key
▪️Append original string to the corresponding group
▪️Finally, return all grouped values
This approach keeps the implementation clean and scalable.
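Roughly, that translates to:

from collections import defaultdict

def group_anagrams(strs):
    groups = defaultdict(list)
    for word in strs:
        key = tuple(sorted(word))      # anagrams share the same sorted-character key
        groups[key].append(word)
    return list(groups.values())

print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]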
Time Complexity:
▪️Sorting each string takes O(k log k) (where k = length of string)
▪️For n strings → O(n * k log k) overall
Space Complexity:
▪️O(n * k) for storing grouped anagrams
A solid step forward in understanding how hashing + transformations can simplify complex grouping problems. Staying consistent and leveling up daily 💪
#LeetCode #Day2 #Python #DSA #CodingJourney #ProblemSolving
This can also be achieved with threading. Which one should we use: threading or asyncio?