Scaling Async Python: How the Kernel, Network Stack, and epoll Manage Massive Connections
In Blog 1, we discussed how Python’s asyncio, file descriptors (FDs), and event loops enable massive concurrency without threads.
In this post, we go a layer deeper, into the kernel and network stack, and see how a single NIC (Network Interface Card) can serve millions of requests per second with help from epoll.
Let’s break it down.
1. The Kernel’s Role: The Real Network Engine
Your async Python code never deals with network packets directly.
Here’s who does the real work:
NIC (Network Interface Card)
Kernel’s Network Stack (L3 & L4)
Socket Buffer
✅ Key Insight: All packet processing, validation, and reassembly are done in the kernel, before your Python code even wakes up.
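To make the socket buffer idea concrete, here is a minimal sketch. It assumes a local `socket.socketpair()` as a stand-in for a real client and server, so no NIC or TCP processing is involved, but it shows the same principle: the kernel queues the bytes for the receiver before the receiving code does anything.

```python
import socket

# A connected pair of sockets stands in for a client and a server.
sender, receiver = socket.socketpair()

# The sender writes; the kernel copies the bytes into the receiver's socket buffer.
sender.sendall(b"hello from the kernel buffer")

# The receiver has done nothing so far, yet the data is already queued for it.
receiver.setblocking(False)  # recv() raises instead of waiting if nothing is queued
data = receiver.recv(1024)

# Size of the kernel receive buffer for this socket, in bytes (system-dependent).
rcvbuf = receiver.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(data, rcvbuf)
sender.close()
receiver.close()
```

The non-blocking `recv()` succeeds immediately because the data was already sitting in the kernel when Python asked for it.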
2. Enter epoll: Scalable Notification, Not Packet Processing
Let’s clear this up: epoll (on Linux) and kqueue (on BSD/macOS) do not handle or touch packets.
Instead, they solve a different problem:
“How do I wait for thousands of sockets and know exactly which one is ready to read/write — without wasting CPU?”
Here’s how epoll works:
✅ What epoll does:
❌ What epoll does not:
epoll Workflow:
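The create, register, wait cycle can be sketched with Python's `select.epoll` wrapper. This is an illustration only: `epoll_roundtrip` is a made-up helper name, the socket pair is a local stand-in for a real connection, and `select.epoll` exists only on Linux.

```python
import select
import socket

def epoll_roundtrip() -> bytes:
    """Register one socket with epoll, wait for readiness, then read it."""
    client, server = socket.socketpair()
    server.setblocking(False)

    ep = select.epoll()                            # epoll_create: make the interest list
    ep.register(server.fileno(), select.EPOLLIN)   # epoll_ctl: watch this FD for readability
    try:
        assert ep.poll(timeout=0) == []            # nothing sent yet, so nothing is ready
        client.sendall(b"ping")                    # kernel queues the bytes, FD becomes ready
        events = ep.poll(timeout=1.0)              # epoll_wait: returns only the ready FDs
        for fd, mask in events:
            if fd == server.fileno() and mask & select.EPOLLIN:
                return server.recv(1024)           # the read is an ordinary socket call
        return b""
    finally:
        ep.unregister(server.fileno())
        ep.close()
        client.close()
        server.close()

if hasattr(select, "epoll"):                       # skip on non-Linux platforms
    result = epoll_roundtrip()
    print(result)
```

Note that epoll only reports which FD is ready. The bytes themselves are fetched by `recv()`, which copies them out of the kernel's socket buffer.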
epoll Event Types (Bitmasks)
Here are the main flags you might see:
| Event | Meaning |
| ---------- | ---------------------------------------- |
| `EPOLLIN` | FD is readable (data available) |
| `EPOLLOUT` | FD is writable (buffer space available) |
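| `EPOLLERR` | An error occurred on the FD (always reported, even if not requested) |
| `EPOLLHUP` | Hang-up: the peer closed its end (always reported, even if not requested) |
| `EPOLLET` | Not an event but a registration flag that requests edge-triggered notification |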
Python’s asyncio and selectors module handle all of this for you.
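As a rough illustration of that abstraction, the sketch below uses `selectors.DefaultSelector`, which picks epoll on Linux and kqueue on BSD/macOS. The readiness flags become `EVENT_READ` and `EVENT_WRITE`. The helper `ready_events` and the socket pair are assumptions made for the example.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
left, right = socket.socketpair()
right.setblocking(False)

# Ask for both readability and writability on one socket.
sel.register(right, selectors.EVENT_READ | selectors.EVENT_WRITE)

def ready_events() -> int:
    """Return the readiness bitmask currently reported for `right` (0 if none)."""
    mask = 0
    for key, events in sel.select(timeout=0.5):
        if key.fileobj is right:
            mask |= events
    return mask

before = ready_events()  # send buffer has space, nothing to read yet
left.sendall(b"data")
after = ready_events()   # now there is data to read as well

print(bool(before & selectors.EVENT_READ), bool(before & selectors.EVENT_WRITE))
print(bool(after & selectors.EVENT_READ), bool(after & selectors.EVENT_WRITE))

sel.unregister(right)
sel.close()
left.close()
right.close()
```

A fresh socket is writable right away because its send buffer has free space. It becomes readable only once the peer has sent something.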
3. 🔥 How a Single NIC Handles Millions of Requests
It sounds unbelievable at first — millions of requests per second on one NIC?
Here’s how it’s done:
💡 NIC Hardware Capabilities:
⚙️ Kernel Network Stack Optimizations:
⛓ Combined With epoll + asyncio:
4. Visualizing the Journey from NIC to Coroutine
[NIC Hardware]
↓
[Kernel Network Stack (TCP/IP)]
↓
[Socket Receive Buffer (linked to FD)]
↓
epoll watches FD readiness
↓
Kernel triggers epoll_wait wake-up
↓
Asyncio event loop resumes coroutine
↓
Your Python code reads data
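The last few steps of that journey can be seen from the Python side with a small loopback example. It is a sketch under simple assumptions: one client, one server on `127.0.0.1` with an OS-assigned port, and a hypothetical `handle` coroutine that uppercases whatever it receives.

```python
import asyncio

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Suspended here until the kernel reports the connection's FD as readable.
    data = await reader.read(1024)
    writer.write(data.upper())
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main() -> bytes:
    # Port 0 asks the OS for any free port; we look up which one it picked.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"hello")
        await writer.drain()
        reply = await reader.read(1024)  # resumes once the reply is buffered
        writer.close()
        await writer.wait_closed()
    return reply

reply = asyncio.run(main())
print(reply)
```

Neither coroutine ever polls. Each `await` hands control back to the event loop, which sleeps in the selector until the kernel marks a socket ready.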
Each connection is just:
5. Responsibilities Breakdown
| Layer | Role |
| --- | --- |
| NIC | Hardware packet send/receive (Layer 2) |
| Kernel | TCP/IP processing, buffering, connection state |
| epoll/kqueue | Notify which FD is ready (no polling!) |
| asyncio | Map FDs to coroutines and resume them |
| Your Code | Handles app logic after data is ready |
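The asyncio row is the least visible one, so here is a hedged sketch of what "map FDs to coroutines" can look like using the low-level `loop.add_reader` API. The helper `wait_readable` is hypothetical, and the example assumes a selector-based event loop (the default on Unix; `add_reader` is not supported by the Windows proactor loop).

```python
import asyncio
import socket

async def wait_readable(sock: socket.socket) -> bytes:
    """Suspend until the event loop reports `sock` readable, then read it."""
    loop = asyncio.get_running_loop()
    ready = loop.create_future()

    def on_readable() -> None:
        # Called by the event loop when the selector reports this FD as ready.
        loop.remove_reader(sock.fileno())
        if not ready.done():
            ready.set_result(None)

    loop.add_reader(sock.fileno(), on_readable)  # FD -> callback mapping
    await ready                                  # coroutine sleeps here
    return sock.recv(1024)

async def main() -> bytes:
    a, b = socket.socketpair()
    b.setblocking(False)
    try:
        task = asyncio.create_task(wait_readable(b))
        await asyncio.sleep(0)  # let the task start and register its FD
        a.sendall(b"wake up")
        return await task
    finally:
        a.close()
        b.close()

received = asyncio.run(main())
print(received)
```

The future is the bridge here: the callback fires when the FD is ready, the future resolves, and the waiting coroutine resumes to perform the actual read.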
✅ Final Thoughts
To summarize:
Understanding this stack helps you:
Until next time — keep your FDs clean, your loops tight, and your coroutines flying. And remember: by the time you await reader.read(), the kernel’s already done the heavy lifting.