Name: Building Pipekit: A Simplified Apache Airflow in Python | Aditya kumar singh posted on the topic | LinkedIn
Uploaded: 2026-04-23T17:58:54.148Z
Duration: 5 min 36 s
Channel: Aditya kumar singh

Aditya kumar singh

I spent the last few weeks building Pipekit, a distributed data pipeline orchestrator, from scratch in Python. Here's what I built and what I learned.

WHAT IS PIPEKIT?
A simplified Apache Airflow, built from first principles to deeply understand how pipeline orchestrators actually work under the hood.

You define tasks with a decorator:

    @task(retries=3)
    def fetch_data():
        return "raw_data"

Express dependencies in one line:

    merge.depends_on(fetch_users, fetch_orders, fetch_products)

And Pipekit handles the rest.

WHAT IT CAN DO
- DAG execution: Kahn's algorithm resolves dependencies automatically
- True parallelism: tasks in the same wave run across multiple Celery workers simultaneously
- State machine: every task tracked: pending → running → success / failed
- Persistent state: full audit trail in PostgreSQL
- Exponential backoff retry: 2s, 4s, 8s between attempts
- Artifact passing: task outputs flow automatically to downstream tasks
- REST API: trigger pipelines over HTTP (FastAPI)
- CLI tool: pipekit run, pipekit status
- Cron scheduler: pipelines run automatically on a schedule
- Cycle detection: raises an error immediately for circular dependencies

PROOF OF PARALLELISM
3 tasks × 2 seconds each:
Sequential → 6.2 seconds
Pipekit → 2.04 seconds
ForkPoolWorker-7, ForkPoolWorker-8, and ForkPoolWorker-1 all picked up tasks at the same timestamp. That's real parallelism: not threads, actual separate processes.

TECH STACK
- Python (core)
- FastAPI (REST API)
- PostgreSQL (persistent state)
- Redis (message broker)
- Celery (distributed workers)
- APScheduler (cron scheduling)
- Click (CLI)

THE BIGGEST THING I LEARNED
Reliability in distributed systems comes from state. Without persistent state, a crash means you lose everything. You don't know what ran, what failed, where to restart from. With state, you can observe, recover, and retry. That's the insight behind every production orchestrator. Every database. Every message queue. It's all about managing state reliably.

Building this from scratch taught me more about distributed systems than any course ever did. If you're learning backend or distributed systems, I highly recommend building something like this from scratch. You understand it on a completely different level when you've written every line yourself.

Project is on GitHub 👇 https://lnkd.in/gHx4Rtjp
Website link: https://lnkd.in/g4xC5RsN

#Python #DistributedSystems #BackendEngineering #BuildInPublic #SoftwareEngineering #DataEngineering #OpenSource
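
For readers curious how "waves" and cycle detection can fall out of Kahn's algorithm, here is a minimal sketch. It is not Pipekit's actual code: the function name `plan_waves` and the dict-of-sets input shape are assumptions made for illustration.

```python
from collections import defaultdict


def plan_waves(dependencies):
    """Group tasks into waves using Kahn's algorithm.

    `dependencies` maps a task name to the set of task names it depends on
    (a hypothetical input shape, not Pipekit's API). Tasks in the same wave
    do not depend on each other, so they could run in parallel.
    """
    # Collect every task, whether it appears as a key or only as a dependency.
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks.update(deps)

    indegree = {task: 0 for task in tasks}
    dependents = defaultdict(list)
    for task, deps in dependencies.items():
        for dep in set(deps):
            indegree[task] += 1
            dependents[dep].append(task)

    # The first wave is every task with no unmet dependencies.
    wave = sorted(task for task in tasks if indegree[task] == 0)
    waves = []
    scheduled = 0
    while wave:
        waves.append(wave)
        scheduled += len(wave)
        next_wave = []
        for task in wave:
            for child in dependents[task]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_wave.append(child)
        wave = sorted(next_wave)

    # Any task never scheduled must sit on (or behind) a cycle.
    if scheduled != len(tasks):
        raise ValueError("cycle detected among tasks")
    return waves
```

Calling it with `{"merge": {"fetch_users", "fetch_orders", "fetch_products"}}` puts the three fetch tasks in the first wave and `merge` in the second, which is the shape the parallelism numbers above rely on.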


More Relevant Posts

Divine Owai
2w Edited
Just shipped a distributed systems project And honestly i learned more debugging in this than in any project I've ever done 😅 I built a search recommendation engine from scratch, the kind of system that tracks what you click on and uses that data to re-rank future search results in real time The stack: → FastAPI for the APIs → Kafka as the message queue (click events flow through here) → PostgreSQL to store click scores → Redis to cache hot results → Docker Compose to wire everything together The whole flow looks like this: user clicks a result → event goes to Kafka → stream processor reads it → updates scores in postgres → next search returns re-ranked results with Redis serving the hot ones in <1ms sounds clean right? it was NOT clean getting there at all 😭 The issues i ran into (and how i fixed them): 1. Kafka advertising the wrong address to my Python service which kept getting a "connection refused" error, even though Kafka was running. Turned out Kafka was advertising itself as localhost:9092, which inside Docker means the container's own localhost, not the network address other containers could reach. fixed it by setting KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092 so containers find each other by service name. 2. "depends_on" doesn't mean "ready" Docker's depends_on only waits for a container to start, not for the service inside to be ready. Kafka takes 20-30 seconds to fully boot. My stream processor was connecting before Kafka was ready and crashing. I fixed it with proper Docker healthchecks + retry logic in Python that actively probes Kafka every 5 seconds instead of blindly sleeping. 3. WSL2 dropping me into Docker's internal VM when i typed wsl it dropped me into Docker Desktop's internal Linux VM instead of a real Ubuntu distro. everything was broken — no sudo, no apt, wrong environment. Had to install Ubuntu properly via wsl --install -d Ubuntu and work from there. 4. 
Running Python outside Docker: I spent a while confused about why my local Python couldn't reach Kafka. The fix was containerising the Python services too, so everything runs on the same Docker network. Learned the hard way that localhost means something completely different inside vs outside a container.

The most satisfying moment was watching the metrics endpoint show:
cache hit rate: 57%
kafka consumer lag: 0
cache HIT latency: 0.6ms vs cache MISS: 2.3ms

That roughly 4x latency difference between Redis and PostgreSQL is real and measurable, not just theory anymore. Would apply this to an actual application next...

Link to repo: https://lnkd.in/eDt65PvS

#softwareengineering #distributedsystems #kafka #redis #python #docker #buildinpublic #devops

Alright keep scrollingggg :)
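
The "probe instead of sleep" idea from issue 2 can be sketched with the standard library alone. This is an assumption-laden illustration, not the code from the linked repo: `wait_for_port` is a hypothetical helper, and a successful TCP connect only shows the port is accepting connections, which is weaker than a real Kafka metadata request.

```python
import socket
import time


def wait_for_port(host, port, interval=5.0, timeout=60.0, sleep=time.sleep):
    """Probe host:port until a TCP connection succeeds or `timeout` elapses.

    Returns True once a connection is accepted, False if the deadline passes.
    `sleep` is injectable so the wait can be skipped in tests.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is the whole probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            sleep(interval)
```

Inside a Compose network the call would look something like `wait_for_port("kafka", 9092)`, using the service name rather than localhost for the reason described in issue 1.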
GizmoData

385 followers
3w Edited
dbt-gizmosql: a month of new capabilities

Six releases shipped for dbt-gizmosql, the dbt adapter for GizmoSQL (an Apache Arrow Flight SQL engine backed by DuckDB). The headline: the adapter now supports a lot of things it just... didn't before.

→ Python models (brand new!)
Write dbt transformations in Python. dbt.ref() / dbt.source() pull upstream tables as Arrow; return a DuckDB relation, pandas DataFrame, or PyArrow Table and the result is shipped back to the server via ADBC bulk ingest. Incremental Python models supported too.

→ session.remote_sql(): server-side pushdown for Python models
Python models run client-side, so dbt.ref('big_table') streams the whole upstream table across the wire before your code sees it. The new remote_sql() escape hatch runs SQL directly on GizmoSQL and returns only the result, so the filter executes server-side:

    def model(dbt, session):
        schema = dbt.this.schema
        return session.remote_sql(
            f"select * from {schema}.big_table where name = 'Joe'"
        )

→ External materialization: write straight to files
A new 'external' materialization issues a server-side COPY to Parquet, CSV, or JSON. Anywhere the server's DuckDB can reach: local disk, s3://, gs://, azure://, MinIO. Supports partitioning, codecs, and format inference. The result is ref()-able downstream.

→ Microbatch incremental strategy
Time-windowed incrementals via dbt's microbatch strategy: reprocess a recent event_time window each run, with automatic batching.

→ Snapshot merge rewritten around MERGE BY NAME
Snapshots now use DuckDB's native MERGE ... UPDATE / INSERT BY NAME, which is more robust to column reordering and far clearer than a hand-rolled merge.

→ Much faster seed loading
Seeds are now read with DuckDB's CSV reader (correct null handling, proper type inference) and bulk-ingested as Arrow via ADBC instead of row-by-row INSERTs.

A shout-out to ADBC
None of this would be practical without ADBC (Arrow Database Connectivity). 
It gives the adapter a columnar, zero-copy path to the server: seeds and Python-model results ship as Arrow record batches, ref() pulls land as Arrow tables, and remote_sql() streams Arrow results back. It's the reason the adapter can move real data volumes without the usual row-by-row ODBC/JDBC tax. Huge thanks to the Apache Arrow community and the adbc-driver-flightsql maintainers. And finally — the best GizmoSQL features come from the user community. GizmoData thanks our users for their great feedback and engagement! pip install dbt-gizmosql https://lnkd.in/ewKGMUCe #dbt #duckdb #dataengineering #apachearrow #adbc #flightsql

GitHub - gizmodata/dbt-gizmosql: A dbt adapter for GizmoSQL - with the DuckDB engine github.com
Pratik Rauniyar
3w Edited
I recently expanded a Packt open-source microservices codebase (Flask + Python) and turned it into a deeper learning project around distributed systems architecture. Here's what I built on top of the original: → Gave each service its own isolated SQLite3 database (User, Product, Order) - no shared tables, no hidden coupling → Extended the Order Service schema to support order items with unit price snapshots, so historical accuracy is preserved even if prices change later → Designed schemas that can scale independently: each service owns its data contract and can swap to PostgreSQL with zero impact on the others → Containerized all 4 services with Docker Compose, each with its own volume mount for persistence The thing that clicked for me while doing this: A service boundary is a change boundary. If two things always deploy together, change together, and break together, they're not two services. They're one service pretending to be two, with extra network hops and failure points in between. I also learned why bad decomposition is worse than a monolith. Splitting by technical layer (DatabaseService / APIService / LogicService) adds distributed complexity with zero business value. The right split is always along business capabilities - what the business actually does, not how the code is organized. Concepts I got hands-on with: - Database per Service pattern - Service-Oriented Architecture (SOA) - Synchronous (REST) vs asynchronous (event-driven) service communication - Bounded contexts from Domain-Driven Design - Independent deployment pipelines per service If you're getting started with microservices, this pattern reference from Chris Richardson is the best mental model I've found: https://lnkd.in/gKcyNcEj Original repo from PacktPublishing: https://lnkd.in/gENvsSej The version I worked on: https://lnkd.in/gf_HxcyX Building this from scratch (well, expanding it from scratch) made distributed systems a lot less abstract. 
Highly recommend it as a learning project if you're coming from a Flask/FastAPI background. #Microservices #Python #Flask #Docker #DistributedSystems #SystemDesign #BackendDevelopment #SoftwareEngineering #LearningInPublic
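
The "unit price snapshot" idea is easy to see in a small sqlite3 sketch. The table and column names below are assumptions for illustration, not the schema from the linked repo; the point is only that the order item stores the price at purchase time instead of looking it up from the Product service later.

```python
import sqlite3

# Hypothetical Order Service schema: product_id is a plain identifier because
# the product row lives in another service's database.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL
);
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL,
    unit_price_cents INTEGER NOT NULL  -- price copied at order time
);
"""


def open_order_db():
    """Create an in-memory database owned only by the Order Service."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, order_id, product_id, quantity, current_price_cents):
    """Store the item together with a snapshot of the current unit price."""
    conn.execute(
        "INSERT INTO order_items (order_id, product_id, quantity, unit_price_cents) "
        "VALUES (?, ?, ?, ?)",
        (order_id, product_id, quantity, current_price_cents),
    )


def order_total_cents(conn, order_id):
    """Compute the total from the snapshots, never from live product prices."""
    row = conn.execute(
        "SELECT COALESCE(SUM(quantity * unit_price_cents), 0) "
        "FROM order_items WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    return row[0]
```

Because the total is computed from `unit_price_cents`, an older order keeps its original amount even after the product's price changes in the Product service.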
Tusif Ahammed
3w Edited
🚀 How I Finally Understood What “Knowing Python” Means for Data Engineering: A few years ago, I thought learning Python for data engineering meant memorizing syntax, solving coding challenges, and building side projects. But the first time I had to build a real data pipeline… reality hit me hard. I realized Python wasn’t just a language. It was the glue holding the entire data ecosystem together. Let me tell you the version I wish someone had told me on day one. 🧠 Lesson 1: Data Lives on Disk, but Work Happens in Memory: I remember loading a “small” CSV file and watching my laptop freeze. That’s when I learned the golden rule: Data sits on disk, but Python processes it in memory. RAM is limited. Your data probably isn’t. That’s why tools like Spark, DuckDB, and Polars exist — they help you work with data bigger than your laptop can handle. 🐍 Lesson 2: Python Basics Actually Matter: I used to skip the fundamentals. Big mistake. In real pipelines, you rely on: Lists and dictionaries to structure data Loops and comprehensions to transform it Functions to keep your code clean Classes when things get complex Exception handling so your pipeline doesn’t explode at 2 AM These aren’t “beginner topics.” They’re survival skills. 🔄 Lesson 3: ETL/ELT Is Where Python Shines: Once I understood ETL, everything clicked. Python helps you: Extract data from APIs, databases, cloud storage Transform it using Pandas, Polars, Spark, or SQL Load it into warehouses like Snowflake or BigQuery It’s not about writing fancy scripts. It’s about moving data from chaos → clarity. 🧰 Lesson 4: The Tools Become Your Superpowers: I used to think data engineers only needed Python. Then I met: psycopg2 for databases boto3 for AWS requests for APIs Parquet, JSON, CSV, XML Kafka, Kinesis, SFTP Suddenly, Python felt less like a language and more like a Swiss Army knife. ✔️ Lesson 5: Data Quality Is Non‑Negotiable: Your pipeline isn’t done when it runs. It’s done when the data is trustworthy. 
Tools like: Great Expectations Cuallee help you validate data before anyone sees it. 🧪 Lesson 6: Tests Save You From Yourself: The first time I broke production, I learned this the hard way. pytest became my best friend. Tests catch bugs before your users do. ⏱️ Lesson 7: Schedulers & DAGs Are the Real Magic: Pipelines don’t run because you press “Run.” They run because schedulers like: Airflow Dagster Cron wake them up at the right time. And DAGs make sure everything runs in the right order. 🎯 Final Thought : Learning Python for data engineering isn’t about mastering every feature of the language. It’s about understanding how Python connects systems, moves data, and keeps pipelines reliable. Once you see Python as the orchestrator of your data world, everything changes.
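
Lesson 1 (data on disk, work in memory) can be made concrete with a standard-library sketch that reads a CSV in fixed-size chunks, so memory use depends on the chunk size rather than the file size. The helper names and the `amount` column are hypothetical, chosen only for the example.

```python
import csv
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` items from any iterator."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def total_amount(csv_file, chunk_size=1000):
    """Sum the (assumed) `amount` column while holding one chunk in memory."""
    reader = csv.DictReader(csv_file)
    total = 0.0
    for chunk in iter_chunks(reader, chunk_size):
        # Only this chunk's rows are alive at once; earlier chunks are freed.
        total += sum(float(row["amount"]) for row in chunk)
    return total
```

Tools like Polars and DuckDB apply the same streaming idea with far more sophistication; this only shows the principle.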
Ugonna Udunwa, Ph.D.
2w
Data engineers, have you stopped learning Python after loops and functions? Well, basic Python alone doesn't cut it for enterprise data architecture. While PySpark and Databricks handle the heavy distributed scaling, you still need strong Python engineering skills to build clean, resilient, and maintainable systems. Below are six high-impact advanced Python concepts that help bridge the gap between quick scripts and production-grade pipelines: Step 1: Object-Oriented Programming Use classes and inheritance to create reusable components like custom data transformers, source connectors, or pipeline builders. This keeps your PySpark code DRY and easier to extend without repeating logic everywhere. Step 2: Pydantic for Data Validation Python doesn't enforce schemas natively. When ingesting raw JSON, APIs, or messy files, Pydantic lets you define strict models that validate and parse data early before it lands in your bronze layer. This catches errors upfront and improves overall data quality. Step 3: Pytest for Automated Testing Debugging in production (or even in Databricks notebooks) is painful. Write unit and integration tests for your functions, transformations, and edge cases. Run them in your CI/CD pipeline (Azure DevOps or Databricks Workflows) before deployment. Step 4: Clean Configuration & Error Handling Move beyond hard-coded values. Use Pydantic Settings or environment-based configs. Combine this with robust logging, retries, and exception handling so your pipelines don't just crash on the first unexpected issue. Step 5: Concurrency Tools (When Appropriate) For I/O-heavy tasks like making many API calls to Azure Data Lake, external services, or Oracle, use asyncio + httpx (asynchronous requests) or ThreadPoolExecutor for parallel processing. Important caveat: These are best for driver-side or ingestion steps. For core data transformations on large datasets, rely on Spark’s native parallelism (partitions, mapPartitions, etc.) 
instead of fighting Python’s GIL with heavy threading. Step 6: Advanced API & Integration Patterns Master secure JWT/OAuth handling, pagination, retries (with backoff), and rate-limit management when pulling from external systems. Tools like httpx (async) or requests.Session make this reliable and efficient. Mastering these areas (plus deep PySpark knowledge: lazy evaluation, partitioning, Delta Lake, Spark UI tuning) turns your code from fragile scripts into resilient data architecture.
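
The "retries (with backoff)" advice in Steps 4 and 6 can be sketched as a small decorator. This is one possible shape, not a recommendation of a specific library: the `retry` name, its parameters, and the injectable `sleep` are all assumptions for the example.

```python
import time
from functools import wraps


def retry(retries=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Retry a function with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    After `retries` retries the last exception is re-raised.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retries:
                        raise  # out of retries: surface the real error
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

Restricting `exceptions` to transient errors (timeouts, connection resets) matters in practice, since retrying a validation error only delays the failure.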
Leandro Silveira
1mo
2026: Data Engineering ecosystems are becoming parallel — and that's a great thing. After years of Apache Spark dominating almost everything, we're witnessing a healthy and productive fragmentation in the data processing landscape. Each language is building its own optimized path, with distinct strengths: Java Spring Batch remains the king of enterprise-grade orchestration. Its fine-grained reusability of Readers, Processors, Writers, retry logic, skip policies, listeners, and job repository is still unmatched. When massive horizontal scale is needed, Spark (via Tasklet or spark-submit) steps in. A classic, reliable combination still widely used in heavy corporate environments. Python The focus here is on modern orchestration: Airflow (mature at scale), Dagster (asset-centric), and Prefect (lightweight and developer-friendly). Heavy lifting is still handled by PySpark when volume demands it, but more teams are migrating "medium data" workloads to Polars — which delivers absurd performance with far less infrastructure. Rust This is where things get really interesting. - Single-node / vertical scaling: Polars and DataFusion often outperform Pandas and even Spark in many scenarios — with dramatically lower memory usage, no GC pauses, and brutal vectorized execution. - Orchestration: spring-batch-rs (v0.3.1) brings the spirit of Spring Batch to Rust, with chunk-oriented processing and strong component reusability. - Horizontal scaling: We still don't have a fully mature open-source "Spark Driver" equivalent, but we're getting close. Sail (LakeSail) provides a Rust-native backend compatible with Spark Connect — you keep almost the exact same PySpark code with no rewrites and gain ~4x speed with significantly lower costs. Polars Cloud now has its distributed engine in GA on AWS (with open beta for horizontal scaling and fault tolerance), and Ballista continues evolving as a pure Rust option. Bottom line: We're no longer in a "Spark for everything" world. 
We're building parallel ecosystems, each optimized for its own philosophy: - Java → enterprise stability and component reusability - Python → developer productivity and rich ecosystem - Rust → extreme performance, resource efficiency, and memory safety The coolest part? They don't have to compete directly. Many teams are adopting hybrid approaches: orchestrate with Spring Batch or Prefect, and delegate heavy processing to Sail or Polars Cloud. Which ecosystem are you using (or migrating to) in 2026? Have you tried Sail with Spark Connect yet? Or Polars Cloud for workloads that used to require a Spark cluster? I'd love to hear your experiences in the comments 👇 #DataEngineering #Rust #Polars #ApacheSpark #Sail #DataPlatforms #ETL
Marcos Vinicius Thibes Kemer
3w
✅ #PythonJourney | Day 152 — All API Endpoints Tested & Production Ready Today: Comprehensive endpoint testing. The entire URL Shortener API is now fully operational! Key accomplishments: ✅ Tested 4 critical endpoints: • POST /api/v1/urls → Creates shortened URL with auto-generated short code • GET /api/v1/urls → Returns user's URL list (ordered by newest first) • GET /api/v1/urls/{url_id} → Retrieves specific URL details • GET /{short_code} → Redirects to original URL + tracks click in database ✅ Fixed SQLAlchemy Click model: • Issue: Composite primary key (id + clicked_at) prevented autoincrement • Solution: Made id the sole primary key, clicked_at just a timestamp • Result: Click tracking now works perfectly ✅ Verified full request/response cycle: • Authentication: API key validation ✓ • Input validation: Pydantic models ✓ • Database operations: CRUD complete ✓ • Click tracking: Events recorded correctly ✓ • Response serialization: JSON output perfect ✓ ✅ Data flow confirmed: 1. User creates URL → Stored in PostgreSQL 2. User accesses short code → Redirect happens 3. Click event → Recorded in clicks table 4. URL counter → Incremented automatically 5. JSON response → Properly formatted What I learned today: → Comprehensive testing reveals edge cases early → SQLAlchemy's primary key behavior affects autoincrement → Docker image caching can hide recent code changes → Click tracking requires careful database schema design → Manual testing validates the entire architecture The API is now: - ✅ Accepting requests from multiple sources - ✅ Storing data reliably in PostgreSQL - ✅ Returning proper JSON responses - ✅ Tracking user behavior - ✅ Handling redirects correctly - ✅ Managing database transactions safely Endpoints remaining to test: - GET /api/v1/urls/{url_id}/analytics (analytics aggregation) - DELETE /api/v1/urls/{url_id} (soft delete) Status: API Core is production-ready. Ready for comprehensive test suite (pytest) next. 
This is what backend development looks like: build → test → debug → iterate → victory! #Python #FastAPI #API #Testing #Backend #PostgreSQL #Docker #SoftwareDevelopment #StartupLife
Abdulai Tamba Lebbie
2w
Building a multi-tenant dissertation management platform from scratch here is the architecture I chose and why. As a DevSecOps engineer, the hardest decisions are not about writing code. They are about where things live, who owns them, and how they talk to each other. Here is what I built and the reasoning behind every decision: The Stack FastAPI + SQLAlchemy (async) + PostgreSQL + pgvector + Alembic + Docker Multi-repo, single responsibility Each service — API, AI, workers, frontend — lives in its own repo. Not a monorepo. Not "let's figure it out later." Clear ownership from day one. One service owns the database Only `akep-api` connects to PostgreSQL directly. The AI service and workers call the API — they never touch the database. This is not laziness. It is the difference between a system that scales and one that collapses under its own complexity. Migrations as code, not scripts No raw SQL files executed manually. Every schema change goes through Alembic, committed to git, reviewed in a PR, and applied automatically by CI/CD. Local → Staging → Production. No exceptions. Least privilege by design Two database roles: - `akep_migrator` — runs schema changes during deployment only - `akep_app` — SELECT, INSERT, UPDATE, DELETE. Cannot touch schema. The running application never has migration privileges. Ever. 🌱 Three environments, one codebase Local (Docker), Staging (AWS RDS + GitHub Secrets), Production (AWS RDS + AWS Secrets Manager). The only thing that changes is `DATABASE_URL`. No code changes between environments. pgvector for AI Dissertation similarity search and AI validation results are stored as 768-dimension vectors directly in PostgreSQL. No separate vector database to manage, secure, or pay for at this stage. This is not over-engineering. 
Every decision has a reason:
- Developers cannot break production
- Schema changes are auditable and reversible
- Secrets never touch the codebase
- Every environment is reproducible

I am building this while completing my Master's in Cybersecurity and AI, because the best way to learn senior-level engineering is to actually do it. If you are a junior developer or student reading this: the architecture conversation happens before the first line of code. Start there.

#DevSecOps #Python #FastAPI #PostgreSQL #CloudSecurity #SoftwareArchitecture #AWS #BackendDevelopment #Alembic #OpenToWork
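
The "only `DATABASE_URL` changes between environments" rule can be sketched as a fail-fast loader. This is a hypothetical illustration, not code from the project: `load_database_url` and `ConfigError` are invented names, and the check that the scheme starts with `postgresql` is an assumption based on the stack described above.

```python
import os
from urllib.parse import urlsplit


class ConfigError(RuntimeError):
    """Raised when required configuration is missing or malformed."""


def load_database_url(env=None):
    """Return DATABASE_URL from the environment, failing fast if unusable.

    `env` defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    if not url:
        raise ConfigError("DATABASE_URL is not set")
    scheme = urlsplit(url).scheme
    # Accepts plain "postgresql" as well as driver-qualified schemes.
    if not scheme.startswith("postgresql"):
        raise ConfigError(f"expected a postgresql URL, got scheme {scheme!r}")
    return url
```

Failing at startup, rather than at the first query, keeps a misconfigured environment from serving traffic at all.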
Sunil Sharma
3w Edited
I just published spark-perf-lint to PyPI, the first dedicated Apache Spark performance linter for the Python ecosystem with built-in pre-commit hooks, CI/PR annotations, and deep audit capabilities. PyPI: https://lnkd.in/gu_qd5yB Live Webpage: https://lnkd.in/gje5sMac GitHub: https://lnkd.in/g6WF8-Yn pip install spark-perf-lint One command. That's all it takes for any PySpark team in the world to start catching performance anti-patterns before they reach production. There are 500,000+ PySpark projects on GitHub. Thousands of organizations run Spark ETL pipelines processing billions of rows daily. Yet until today, there was no dedicated Spark performance linter available on PyPI. We have linters for Python style, type safety, security, even framework-specific rules for Django and FastAPI. But the framework that processes more data than almost anything else in the enterprise? Nothing. Today that changes. What I've contributed to the ecosystem: → 93 Spark-specific performance rules — not generic Python lint. Every rule understands Spark internals: how the Catalyst optimizer works, when shuffles happen, what causes data skew, which join strategy Spark will choose and why. → 11 dimensions of coverage — cluster configuration, shuffle optimization, join strategy, partitioning, data skew, caching lifecycle, I/O and file formats, AQE tuning, UDF patterns, Catalyst optimizer, and monitoring gaps. This is the most comprehensive Spark performance rule set available anywhere — open source or commercial. → Every finding comes with a fix — not "consider optimizing." Actual before/after code. Specific config changes. Spark internals explanation of why the anti-pattern hurts. Estimated performance impact. Effort level to fix. → A complete knowledge base — 50+ Spark patterns with decision matrices. When to use broadcast join vs sort-merge vs bucket join. How to size partitions. When to cache vs checkpoint. This alone is worth reading even if you never run the linter. 
→ Three tiers of analysis: pre-commit hook, CI/PR analysis, and deep audit.

Why this matters at scale: A single missed .collect() without a filter can OOM your driver with 200M rows. A join on a low-cardinality column can create straggler tasks that run 50x longer than the median. A default spark.sql.shuffle.partitions=200 on a 500GB dataset creates 200 partitions of 2.5GB each, which all but guarantees spills and GC pressure.

These bugs don't show up in dev. They don't show up in code review. They show up at 2 AM when your production pipeline fails and the on-call engineer is staring at a Spark UI full of red. spark-perf-lint catches them at commit time. Before they ever run on a cluster. Before they cost compute. Before they wake anyone up.

Try it today!

#ApacheSpark #PySpark #DataEngineering #OpenSource #Python #PerformanceEngineering #ETL #BigData #DevTools #PreCommit #PyPI #ClaudeCode #Fintech #DataPipelines
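
The shuffle-partition arithmetic in the post is worth working through once. The sketch below is not part of spark-perf-lint; the function names are hypothetical, the 128 MB target is a common rule of thumb assumed here rather than a figure from the post, and it simplifies by treating the shuffled data as the same size as the dataset.

```python
import math


def shuffle_partition_size_gb(dataset_gb, num_partitions):
    """Average partition size if the data is spread evenly across partitions."""
    return dataset_gb / num_partitions


def suggested_shuffle_partitions(dataset_gb, target_partition_mb=128):
    """Partition count needed to hit an (assumed) target partition size."""
    return math.ceil(dataset_gb * 1024 / target_partition_mb)
```

With the post's numbers, 500 GB over the default 200 partitions gives 2.5 GB per partition, while a 128 MB target would call for about 4,000 partitions. Real jobs also have to account for skew and compression, which an even-split estimate ignores.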

spark-perf-lint pypi.org

Ahmed Shehab
2w
Scaling Python backends with asyncio and PostgreSQL (asyncpg) requires thinking beyond async/await syntax. If you don't map your coroutines to the underlying OS-level sockets and memory buffers, you will hit silent deadlocks, connection exhaustion, and OOM crashes. Spent a lot of time reading and building lately, and I wanted to share the most important aspects of building high-performance async database drivers. Here’s what I’ve learned: Throttle with asyncio.BoundedSemaphore: Don't just dump 10,000 tasks onto the event loop. Match your semaphore limit to your connection pool's max_size. This provides backpressure, preventing task queue timeouts and event loop thrashing. (Tip: Always use BoundedSemaphore over Semaphore to catch rogue .release() calls). Pipeline with executemany(): Stop running .execute() in a loop. executemany leverages the Postgres extended query protocol (PARSE once, BIND/EXECUTE many) to pack the TCP window and eliminate thousands of network Round Trip Times (RTT). Isolate State with Savepoints: Use nested async with conn.transaction() blocks to handle partial payload failures. When an inner block fails, it just flags the Postgres SubXID as aborted (leaving dead tuples for the VACUUM process) while allowing the parent transaction to safely commit. Prevent OOMs with Server-Side Cursors: Never use .fetch() for massive multi-million row exports. Stream them via async for row in conn.cursor(query, prefetch=chunk_size). This guarantees your Python process memory stays strictly bounded to the chunk size, no matter how large the table gets. Shield Your Cleanup: If a client abruptly drops an HTTP connection, the ASGI server will inject an asyncio.CancelledError. If you don't wrap your pool.release() and tx.rollback() in asyncio.shield() inside your Unit of Work, the network socket will be left permanently checked out, leading to a silent pool deadlock. Adopt asyncio.TaskGroup: (Python 3.11+) Move away from naked asyncio.gather(). 
TaskGroups provide structured concurrency—if one concurrent validation query fails, the siblings are safely and instantly cancelled, returning their leased connections to the pool immediately. Avoid Distributed Transactions: Don't attempt Two-Phase Commits (2PC) across microservices using the event loop; it destroys throughput. Rely on the Transactional Outbox pattern: commit your local database mutation and an event payload in the same transaction, and let your message broker manage eventual consistency. Stop treating the event loop like magic. Treat it like an I/O multiplexing coordinator. #Python #Asyncio #PostgreSQL #BackendEngineering #SoftwareArchitecture #DistributedSystems
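
The semaphore-as-backpressure tip can be sketched without a database. In this illustration `run_bounded` is a hypothetical helper and each job stands in for a query that would lease a pooled connection; it uses `asyncio.gather` only so the sketch also runs on Python versions before 3.11, where `asyncio.TaskGroup` is not available.

```python
import asyncio


async def run_bounded(jobs, limit):
    """Run zero-argument coroutine factories with at most `limit` in flight.

    `limit` plays the role of the connection pool's max_size: a job only
    starts its work once it holds a semaphore slot.
    """
    semaphore = asyncio.BoundedSemaphore(limit)

    async def guarded(job):
        # BoundedSemaphore raises ValueError on an extra release(), which
        # surfaces bookkeeping bugs instead of silently raising the limit.
        async with semaphore:
            return await job()

    # Results come back in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Passing factories rather than already-created coroutines keeps each job from doing any work until a slot is free.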