Concurrency & Threading in Python: How to Achieve Real Parallelism (Without the Usual Pitfalls)
Concurrency is one of those topics every backend engineer thinks they understand until race conditions, idle workers, or mysterious performance bottlenecks show up in production.
Over time, I’ve learned that most concurrency problems aren’t caused by threads themselves, but by how work is claimed, scheduled, and coordinated.
This post breaks down practical techniques for achieving safe, scalable concurrency in Python, especially for I/O-bound systems, without overengineering.
The Most Common Concurrency Anti-Pattern
A very typical design looks like this:
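(A sketch of the pattern; fetch_pending_tasks and process_task are hypothetical stand-ins for whatever your job actually does.)

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_batch_job():
    while True:
        # Fetch a fixed-size batch of pending work (hypothetical helper).
        tasks = fetch_pending_tasks(limit=50)

        # Fan the batch out to a pool and wait for *every* task to finish.
        with ThreadPoolExecutor(max_workers=10) as pool:
            list(pool.map(process_task, tasks))

        # Only then go back for the next batch.
        time.sleep(30)
```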
It feels parallel, but in practice the whole batch waits on its slowest task, workers sit idle between batches, and no new work starts until the previous batch finishes.
Concurrency isn’t about batching. It’s about continuous flow.
The Key Idea: Persistent Workers, Not Batch Jobs
A far more effective model is persistent polling workers:
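(A minimal sketch; claim_one_task and handle are placeholders, and a real claiming query is shown later in the post.)

```python
import threading
import time

def worker_loop(stop_event: threading.Event) -> None:
    """Always-on worker: claim one task, process it, immediately look for more."""
    while not stop_event.is_set():
        task = claim_one_task()   # hypothetical atomic claim (see FOR UPDATE SKIP LOCKED below)
        if task is None:
            time.sleep(1.0)       # nothing to do right now; back off briefly
            continue
        handle(task)              # hypothetical processing function

# A handful of always-on workers instead of a scheduled batch job.
stop = threading.Event()
for _ in range(4):
    threading.Thread(target=worker_loop, args=(stop,), daemon=True).start()
```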
This creates a system where work flows continuously: each worker finishes a task and immediately looks for the next, so throughput scales with the number of workers rather than with batch timing.
Think of workers as always-on consumers, not scheduled batch processors.
The Real Problem: Race Conditions
As soon as multiple threads or processes fetch work from a shared database, race conditions appear:
Two workers select the same rows, both believe they own the same task, and the same work gets processed twice.
This isn’t a threading bug; it’s a data ownership problem.
The Solution: Atomic Work Claiming
Instead of the classic read-then-update pattern:
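(A sketch of the racy version, assuming psycopg2 and a tasks table with a status column; two workers can interleave the two steps.)

```python
import psycopg2

conn = psycopg2.connect("dbname=app")  # illustrative DSN
with conn, conn.cursor() as cur:
    # Step 1: read a pending task...
    cur.execute("SELECT id FROM tasks WHERE status = 'pending' LIMIT 1")
    row = cur.fetchone()
    if row:
        # Step 2: ...then mark it claimed. Another worker can run step 1
        # between these two statements and grab the same row.
        cur.execute("UPDATE tasks SET status = 'processing' WHERE id = %s",
                    (row[0],))
```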
You must claim work atomically.
In PostgreSQL, the most powerful (and underused) tool for this is:
FOR UPDATE SKIP LOCKED
What it gives you:
Multiple threads, or even multiple services, can safely run the same claiming query concurrently without collisions: rows locked by one transaction are simply skipped by everyone else.
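A sketch of a claiming query built around it (the tasks schema and the created_at and payload columns are assumptions):

```python
CLAIM_SQL = """
    UPDATE tasks
       SET status = 'processing'
     WHERE id = (
           SELECT id
             FROM tasks
            WHERE status = 'pending'
            ORDER BY created_at
            LIMIT 1
            FOR UPDATE SKIP LOCKED
     )
 RETURNING id, payload;
"""

def claim_one_task(conn):
    """Atomically claim one pending task; returns None when nothing is available."""
    with conn, conn.cursor() as cur:
        cur.execute(CLAIM_SQL)
        return cur.fetchone()
```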
This single technique eliminates duplicate claims, double processing, and the need for application-level locking.
“But Python Has the GIL…”
Yes—and it matters far less than people think.
The Global Interpreter Lock only serializes CPU-bound Python bytecode; it is released whenever a thread waits on I/O.
Most real-world backend systems spend their time waiting on databases, HTTP calls, message queues, caches, and disk.
All of these release the GIL.
If your workload is dominated by waiting on I/O rather than executing CPU-heavy Python, then threading gives you near-real parallelism with far less complexity than multiprocessing.
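For example, threads fanning out over network calls run concurrently because each blocked call releases the GIL (a standard-library sketch; the URLs are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URLS = [f"https://example.com/item/{i}" for i in range(20)]

def fetch(url: str) -> int:
    # urlopen blocks on the network and releases the GIL while it waits.
    with urlopen(url, timeout=10) as resp:
        return len(resp.read())

with ThreadPoolExecutor(max_workers=8) as pool:
    sizes = list(pool.map(fetch, URLS))
```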
When threading is a great choice
I/O-bound work: database queries, HTTP calls, queue consumers, file and network I/O.
When it's not
CPU-bound Python code: parsing, number crunching, image processing. There the GIL really does serialize execution, and multiprocessing or native extensions are the better fit.
Why Batch Size = 1 Often Wins
Counterintuitive but true:
Claiming one task at a time per worker usually outperforms batch processing.
Why? Fast tasks never wait behind slow ones in the same batch, a failure or retry affects exactly one task, and any idle worker can pick up new work immediately.
The cost (more database round trips) is usually negligible compared to the gains.
Sleep Only When Idle
Another subtle performance killer is sleeping after every cycle.
Bad pattern:
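(A sketch, reusing the placeholder names from the worker loop above: the worker naps even when more work is waiting.)

```python
while not stop_event.is_set():
    task = claim_one_task()
    if task:
        handle(task)
    time.sleep(5)   # unconditional: adds up to 5s of latency per task, even under load
```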
Better pattern:
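(Same sketch, but the worker only backs off when a claim comes back empty.)

```python
while not stop_event.is_set():
    task = claim_one_task()
    if task:
        handle(task)     # busy: immediately go back for the next task
    else:
        time.sleep(5)    # idle: only now is there any point in napping
```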
This alone can reduce latency by seconds under load.
Connection Pools Are Non-Negotiable
Threading without a database connection pool is asking for trouble.
A proper pool caps the total number of open connections, hands each thread its own connection on demand, and reuses connections instead of opening a new one per query.
Each thread borrows a connection briefly and returns it: no shared connections, no leaks.
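A sketch with psycopg2's built-in thread-safe pool (pool sizes and DSN are illustrative; CLAIM_SQL is the claiming query from earlier):

```python
from psycopg2.pool import ThreadedConnectionPool

pool = ThreadedConnectionPool(2, 10, dsn="dbname=app")  # min 2, max 10 connections

def claim_one_task():
    conn = pool.getconn()            # borrow a connection for this thread
    try:
        with conn, conn.cursor() as cur:
            cur.execute(CLAIM_SQL)   # the FOR UPDATE SKIP LOCKED query from earlier
            return cur.fetchone()
    finally:
        pool.putconn(conn)           # always return it, even if the query fails
```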
Thread Safety Isn’t About Threads
The biggest takeaway:
Thread safety is mostly a data problem, not a threading problem. If work is claimed atomically in the database, each worker exclusively owns the rows it claims, and no mutable state is shared in memory, then your application code becomes dramatically simpler.
No global locks, no complex coordination, no fragile in-memory state.
Just workers doing work.
Key takeaways:
- Use persistent polling workers, not scheduled batch jobs.
- Claim work atomically with FOR UPDATE SKIP LOCKED; a batch size of 1 usually wins.
- The GIL rarely matters for I/O-bound workloads; threads are enough.
- Sleep only when idle, and always run threads against a connection pool.
- Treat thread safety as a data-ownership problem, not a locking problem.