Postgres Queue Architecture Avoids VACUUM Bloat with TRUNCATE

Every Postgres queue built on SKIP LOCKED + DELETE eventually turns into a VACUUM problem. You ship it, it works for a week, and then dead tuples start piling up. Index bloat. Autovacuum chasing its tail. The dashboard that was green last Tuesday is suddenly the reason you're on a call at 9pm.

Nikolay Samokhvalov just shipped PgQue, a resurrection of PgQ (the queue architecture Skype built for messaging hundreds of millions of users back in the day). Pure PL/pgSQL. One SQL file. pg_cron to tick. No sidecar daemon.

The trick is snapshot-based batching plus TRUNCATE-based table rotation instead of per-row deletes. Rotate partitions, truncate the old one, done. No bloat, because there are no dead tuples to clean up. The tradeoff is end-to-end delivery latency in the 1-2 second range, which for plenty of workloads is fine.

https://lnkd.in/gC6HTbfP

This is the kind of thing I love about the Postgres ecosystem. Someone looked at an architectural pattern that's been quietly battle-tested for a decade, noticed the zero-bloat property, and packaged it as SQL you can read in an afternoon. No new infrastructure. No vendored runtime. Just the database you already run.

What's your current queue setup looking like?

#PostgreSQL #DatabaseEngineering #Backend #OpenSource

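For intuition, a minimal sketch of the rotation idea in plain SQL. The table names are hypothetical, not PgQue's actual schema; it only illustrates why TRUNCATE-based reclamation leaves nothing behind for VACUUM:

```sql
-- Events land in the "current" table of a small ring of identical tables.
CREATE TABLE queue_events_0 (ev_id bigint, ev_time timestamptz, ev_payload jsonb);
CREATE TABLE queue_events_1 (LIKE queue_events_0);
CREATE TABLE queue_events_2 (LIKE queue_events_0);

-- Consumers read whole batches bounded by snapshots -- no per-row UPDATE/DELETE.
-- Once every consumer has moved past table 0, reclaim it in O(1):
TRUNCATE queue_events_0;   -- instant, and produces zero dead tuples
```
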
🐘 PgBouncer is great — but it’s not the whole story.

If you run PostgreSQL, chances are you’re using PgBouncer for connection pooling. It’s simple, efficient, and does one thing very well. But at some point, you start hitting limitations:
- no query routing
- no read/write split
- no visibility into traffic
- limited control beyond pooling

That’s exactly why we wrote this post:
👉 moving from PgBouncer to ProxySQL (for PostgreSQL)

ProxySQL is not just a pooler. It’s a SQL-aware proxy that can:
- route queries based on rules
- split reads/writes
- multiplex connections
- integrate with HA setups
- provide observability

So the real question becomes:
👉 when is PgBouncer enough, and when do you need more?

This post from Rahim Kanji is the first in a series exploring that transition.
📖 https://lnkd.in/g9H3uVuh

Curious to hear from PostgreSQL users: are you hitting limits with PgBouncer, or is it still “good enough” for your use case?

#PostgreSQL #PgBouncer #ProxySQL #DevOps #SRE #Database #OpenSource

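To make "route queries based on rules" concrete, a sketch of a read/write split in ProxySQL's admin interface. Assumption: ProxySQL's PostgreSQL support exposes pgsql_* admin tables mirroring the classic mysql_* ones; verify the exact table names and LOAD/SAVE syntax against the docs for your version:

```sql
-- Hostgroup 1 = primary (writes), hostgroup 2 = replicas (reads).
-- Assumes pgsql_query_rules mirrors mysql_query_rules -- check your ProxySQL version.
INSERT INTO pgsql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT .* FOR UPDATE', 1, 1),  -- locking reads stay on the primary
       (2, 1, '^SELECT',               2, 1);  -- other reads go to replicas

LOAD PGSQL QUERY RULES TO RUNTIME;
SAVE PGSQL QUERY RULES TO DISK;
```
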
Do you really need Kafka… or just Postgres?

PgQue is a zero-bloat job queue built entirely inside Postgres. No extra infra, no brokers: just one SQL file + pg_cron.

What stands out:
• Runs where your data already lives
• Minimal setup, easy to reason about
• Great fit for small to mid-scale async workloads

Not everything needs a distributed system. Sometimes the simplest stack wins.

Check it out: https://lnkd.in/eRzdTC-K

#Postgres #Backend #DevOps #SoftwareEngineering #Architecture #Simplicity

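A sketch of what the produce/consume cycle could look like, assuming PgQue keeps the classic PgQ function API (pgq.create_queue, pgq.insert_event, and the batch functions); verify the actual signatures against the PgQue repo:

```sql
-- One-time setup: a queue and a registered consumer.
SELECT pgq.create_queue('jobs');
SELECT pgq.register_consumer('jobs', 'worker_1');

-- Producer side: enqueue inside your normal transaction
-- (a transactional outbox with no extra machinery).
SELECT pgq.insert_event('jobs', 'email.send', '{"to": "user@example.com"}');

-- Consumer side: pull a whole batch, process it, acknowledge once.
SELECT pgq.next_batch('jobs', 'worker_1');   -- returns a batch_id, or NULL if nothing yet
SELECT * FROM pgq.get_batch_events(1);       -- events in that batch (1 = batch_id from above)
SELECT pgq.finish_batch(1);                  -- one ack per batch, not per row
```
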
Database Sharding & Partitioning Strategies for 1M+ QPS – What Actually Works in Production 📊

After tuning databases that crossed 1M+ queries per second, I realized one hard truth: “Just add more replicas” is a myth at real scale. You need smart sharding + partitioning designed from day one.

Here’s the practical decision framework I use in production:

Sharding Strategies – When to Choose What:
- Range Sharding → perfect for time-series data, logs, or sequential IDs (e.g., orders by order_date)
- Hash Sharding → best for even distribution on high-cardinality keys like user_id, session_id, or tenant_id
- Composite / Directory-based → when you need both flexibility and low-latency routing

PostgreSQL Declarative Partitioning (Still a Game-Changer in 2026):
PostgreSQL’s native partitioning has matured beautifully. My go-to patterns:
- Range Partitioning — time-based data + easy archiving (monthly/weekly)
- List Partitioning — status, region, or category-based queries
- Hash Partitioning — massive tables needing even row distribution

My Real-World Checklist Before Sharding Anything:
1. Max out connection pooling, indexes, and query tuning first.
2. Choose a shard key that covers 80%+ of your query patterns.
3. Always plan for future re-sharding (it will happen).
4. Use native partitioning as long as possible — go to Citus or Vitess only when you need true horizontal distribution across nodes.
5. Maintain a global lookup / routing table — never do blind hashing in the application layer.

Pro Tip: Partition pruning is your best friend. Make sure your most frequent WHERE clauses include the partition key.

Backend & Database engineers — what sharding or partitioning strategy actually saved (or broke) your system at scale? Drop your war stories below 👇 Let’s exchange real architecture lessons!

#DatabaseOptimization #PostgreSQL #Sharding #Partitioning #HighScaleSystems #SystemDesign #BackendDevelopment #Citus #JavaBackend #SeniorDeveloper #SpringBoot

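As a concrete instance of the range pattern and the pruning tip above (table and column names are illustrative):

```sql
-- Time-based range partitioning with monthly partitions.
CREATE TABLE orders (
    order_id   bigint      NOT NULL,
    order_date timestamptz NOT NULL,
    user_id    bigint      NOT NULL
) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2026_01 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE orders_2026_02 PARTITION OF orders
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Pruning only happens when the WHERE clause constrains the partition key:
EXPLAIN SELECT count(*) FROM orders
WHERE order_date >= '2026-02-01' AND order_date < '2026-02-08';
-- The plan should scan only orders_2026_02.
```
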
🧩 PgBouncer is simple. The stack around it usually isn't.

That's not a criticism of PgBouncer. It does its job well. The issue is that, in a lot of production PostgreSQL environments, pooling is only one part of the story. Before long, teams are also adding pieces for:
→ read/write routing
→ failover awareness
→ traffic control and query rules
→ observability
→ caching

That's usually when the conversation changes. You're no longer just choosing a pooler. You're deciding whether it still makes sense to keep all of that logic spread across scripts, sidecars, and application code, or to move more of it into a proxy layer that was built to handle it.

📝 That's what Part 1 of this new series is about — where PgBouncer fits well, where the operational glue starts to build up, and why ProxySQL becomes worth a serious look once you need more control, visibility, and flexibility in front of PostgreSQL.

💬 If you're using PgBouncer today, what's the biggest thing you're still managing outside the pooler?

👉 Part 1 — PgBouncer to ProxySQL: Rethinking the PostgreSQL Middle Tier
🔗 https://lnkd.in/dbd_xRVg

#PostgreSQL #ProxySQL #PgBouncer #DatabaseEngineering #SRE

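For contrast, this is roughly the whole surface area of a working PgBouncer setup: a minimal pgbouncer.ini (hosts and pool sizes are placeholders). Everything on the arrow list above has to live somewhere else:

```ini
[databases]
appdb = host=10.0.0.5 port=5432 dbname=appdb   ; placeholder host

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
; pooling only: no routing rules, no read/write split, no per-query visibility
```
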
By default, PostgreSQL does not log slow queries. The `log_min_duration_statement` parameter ships at `-1` (disabled), which means every performance regression happens in the dark until a user complains or an application timeout fires.

The worst part? By the time you notice, the query has been degrading for months. A query that took 50ms six months ago now takes 500ms, and the root cause is buried under months of schema changes and data growth. That is what silent performance drift looks like.

Three things I see teams get wrong with slow query monitoring:

**1. The threshold is either too high or too low.** Setting `log_min_duration_statement` to 10 seconds catches only catastrophic queries. Meanwhile, a stream of 1-2 second queries collectively dominates your database load. Start at 250ms for transactional workloads -- it captures meaningful slowness without flooding the logs.

**2. They optimize outliers instead of total load.** A single 800ms query is less important than a 50ms query running 10,000 times per hour (500 seconds of total load). Use `pg_stat_statements` and sort by `total_exec_time`, not `max_exec_time`. The standard deviation column (`stddev_exec_time`) reveals plan instability -- queries that are sometimes fast and sometimes slow.

**3. They never enable auto_explain.** `log_min_duration_statement` tells you which queries are slow. `auto_explain` tells you why. Set `auto_explain.log_min_duration = '500ms'` to automatically capture execution plans for slow queries, and keep `auto_explain.log_analyze = off` in production to avoid doubling execution cost -- the estimated plan is enough for diagnosis.

Slow query logging, pg_stat_statements, and auto_explain form a three-layer observability stack that catches regressions before users notice.

I wrote a practical guide with the exact configuration, detection queries, and a prevention strategy: https://lnkd.in/exV6FcAq

#PostgreSQL #DatabasePerformance #SlowQueries #DBA #DevOps #SoftwareEngineering

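The three layers from the post, as a sketch of the configuration plus a pg_stat_statements triage query. Column names are the PostgreSQL 13+ ones; the thresholds are the starting points suggested above:

```sql
-- postgresql.conf (shared_preload_libraries needs a restart):
--   shared_preload_libraries   = 'pg_stat_statements,auto_explain'
--   log_min_duration_statement = 250           # ms -- layer 1: which queries are slow
--   auto_explain.log_min_duration = '500ms'    # layer 3: why they are slow
--   auto_explain.log_analyze   = off           # estimated plans; no doubled execution cost

-- Layer 2: rank by total load, not by the single worst execution.
-- (Run CREATE EXTENSION pg_stat_statements; once per database first.)
SELECT calls,
       round(total_exec_time::numeric, 1)  AS total_ms,
       round(mean_exec_time::numeric, 1)   AS mean_ms,
       round(stddev_exec_time::numeric, 1) AS stddev_ms,  -- high stddev hints at plan instability
       left(query, 80)                     AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
```
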
When we took over dbdeployer, it was fundamentally a MySQL tool. Every code path assumed: mysqld, my.cnf, CHANGE MASTER TO.

Adding PostgreSQL forced a bigger question:
👉 how do we make dbdeployer database-agnostic?

The answer was introducing a Provider interface. Not focused on configuration — but on lifecycle:
- create
- start
- stop
- replicate

The interesting part is that MySQL and PostgreSQL are completely different internally… but from this perspective, they fit the same shape.

PostgreSQL was the real test. And once that worked, something else became obvious:
👉 adding new flavors becomes trivial

VillageSQL, for example, required changes to just a few files and reused MySQL capabilities entirely.

This post is a deep dive into that architecture. If you build tools that support multiple backends, this pattern is worth thinking about.

📖 https://lnkd.in/gnrdd5De

#dbdeployer #MySQL #PostgreSQL #ProxySQL #Engineering #OpenSource #DevTools

VillageSQL has been added to dbdeployer by Rene' Cannao'! Working to make sure that VillageSQL is available everywhere that folks want #MySQL.
🚀 SequelPG v0.11.1 is live

If you work with PostgreSQL every day, this will feel familiar:
- You run a query
- You tweak it
- You come back to something from yesterday
- You try to remember what actually worked

Most tools treat query history as just a log. I don’t think that’s enough. In this release, I rebuilt Query History from scratch. Now it’s something you actually use:
- Quickly find past queries
- Reuse them without rewriting
- Debug faster with less context switching

I also refactored the Database Tools layer. You won’t “see” most of it — but you’ll feel it:
- More consistency
- Better performance
- A stronger foundation for what’s coming next

I’m not trying to add more features. I’m trying to reduce friction when working with data.

Full release notes: https://lnkd.in/dFmaV_xH

If you use PostgreSQL, I’d really value your feedback.

#PostgreSQL #DeveloperTools #IndieHacker #BuildInPublic #SwiftUI #DX

PgQue v0.1.0 is out. PgQ -- the Postgres queue system built at Skype 20 years ago for 1B-user-scale workloads -- repackaged for the managed-Postgres era. One SQL file. No C extension. No external daemon. pg_cron to tick.

Why bother reviving a 2007 architecture? Every major Postgres queue in production today uses some flavor of SKIP LOCKED + UPDATE/DELETE. It works under light load. With more data and higher load, it degrades predictably. Then you get posts like these:
- Brandur at Heroku, 2015: a 60k job backlog in one hour from a single open transaction
- PlanetScale, 2026: a death spiral at 800 jobs/sec
- River issue #59, awa issue #169, and so on; Oban's partitioning work; PGMQ's autovacuum tuning guide and duct-taping with pg_partman

The core issue is how Postgres MVCC is implemented and how we deal with it: dead tuples in the hot path, the xmin horizon pinned, vacuum falling behind, query performance quickly degrading. This happens every time you run pg_dump, execute a long analytical query, or have a lagging/unused logical replication slot.

PgQ solved this in 2007 with snapshot-based batching and TRUNCATE rotation -- zero dead tuples in the event path, by design. But PgQ needed a C extension and an external daemon, which means it doesn't run on RDS, Aurora, Cloud SQL, AlloyDB, Supabase, or Neon -- i.e., where most Postgres lives now.

PgQue closes that gap.
💎 Pure SQL + PL/pgSQL (PgQ engine)
👩💻 \i sql/pgque.sql -- you're done
🕑 pg_cron replaces pgqd (optional, recommended)
💻 Python, Go, TypeScript client examples shipped
💙 Apache 2.0

Trade-off: end-to-end event delivery latency is up to a second, depending on ticking frequency. If you need sub-3ms job dispatch, use River, Oban, or graphile-worker (and avoid anything that blocks the xmin horizon). If you need high-throughput event streaming with fan-out inside Postgres -- Kafka-shaped, without Kafka or a hand-rolled transactional outbox -- this is the right shape of tool.

Kudos to Marko Kreen and the Skype engineers who implemented the original PgQ decades ago, and to Alexander Kukushkin, whose recent "Rediscovering PgQ" talk brought this quiet corner of the Postgres ecosystem back into view.

Stars, issues, PRs, and honest criticism all welcome. Link 👇

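The pg_cron wiring is presumably a scheduled call to the engine's ticker; a sketch assuming the classic PgQ function name pgq.ticker() survives in PgQue -- check the repo for the actual maintenance entry points:

```sql
-- pg_cron >= 1.5 accepts N-seconds schedules; the tick interval sets delivery latency.
SELECT cron.schedule(
    'pgque-tick',
    '1 seconds',
    $$SELECT pgq.ticker()$$   -- assumed name: advances batches so consumers see new events
);
```
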
We’ve been heads-down on dbdeployer for a while. Today I want to share a first preview of what’s coming.

When we took over the project, the goal wasn’t just to maintain it — it was to evolve it. From a MySQL sandbox tool… into something closer to a database infrastructure platform.

The new release (v2.1.1) is already a big step in that direction:
- Support for modern MySQL versions (8.4, 9.x)
- PostgreSQL support (from scratch)
- Replication topologies in a single command
- More advanced setups like InnoDB Cluster and ProxySQL integration

What I find interesting is not just the features. It’s the shift in how we think about local environments. Spinning up realistic database topologies should be:
→ fast
→ reproducible
→ disposable
→ version-aware

That’s what we’re building toward.

This post is just an overview — we’ll go much deeper in the next articles (provider architecture, PostgreSQL internals, cluster topologies, CI, etc.).

Repo: https://lnkd.in/exCfVtGr
Full story: https://lnkd.in/e7v3BKW6

i moved my job queue out of postgres for exactly this reason. running it as a redis sorted set with priority scores beats fighting autovacuum every week. for jobs that can wait a second or two though, PgQue with the partition rotate trick looks like the cleanest fix i've seen.