Our PostgreSQL queries were taking 8 seconds. I got them under 200ms. Here's exactly what changed and what didn't.

Context: we were running a multi-tenant SaaS. As the user base grew, a core query started degrading badly. Users noticed. It had to be fixed.

Step 1: EXPLAIN ANALYZE before touching anything

```sql
EXPLAIN ANALYZE
SELECT * FROM bookings
WHERE tenant_id = $1 AND status = 'confirmed'
ORDER BY created_at DESC;
```

The output showed a sequential scan over 2M rows. No index on tenant_id + status.

Step 2: Composite index

```sql
CREATE INDEX idx_bookings_tenant_status
ON bookings (tenant_id, status, created_at DESC);
```

This alone dropped the query from 8s to 900ms. Still not good enough.

Step 3: Stop selecting *

The query was pulling 40 columns. The UI needed 6.

```sql
SELECT id, title, start_time, end_time, status, attendee_id
FROM bookings
WHERE tenant_id = $1 AND status = 'confirmed'
ORDER BY created_at DESC
LIMIT 50;
```

Down to 350ms.

Step 4: The N+1 hiding in the ORM

For each booking, the ORM fired a separate query to fetch the attendee. Switched to a JOIN.

```sql
SELECT b.id, b.title, u.name AS attendee_name
FROM bookings b
JOIN users u ON u.id = b.attendee_id
WHERE b.tenant_id = $1 AND b.status = 'confirmed'
ORDER BY b.created_at DESC
LIMIT 50;
```

Final result: 180ms.

What didn't help: tuning connection pooling (it was already fine), adding a cache (premature; fix the query first), switching databases (never needed).

8 seconds → 180ms. No infrastructure changes. Just understanding what the database was actually doing.

What's your go-to first step when a query goes slow?

#PostgreSQL #SQL #DatabaseOptimization #QueryOptimization #BackendEngineering #FullStackDevelopment #SoftwareEngineering
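One follow-up check worth doing after Step 2, sketched here with the post's table and index names: confirm the planner actually uses the new index rather than assuming it does. Note that EXPLAIN ANALYZE executes the query, so the `$1` placeholder is replaced with an illustrative literal tenant id.

```sql
-- After creating the index, re-check the plan: you want an Index Scan
-- on idx_bookings_tenant_status, not the original Seq Scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title, start_time, end_time, status, attendee_id
FROM bookings
WHERE tenant_id = 42 AND status = 'confirmed'
ORDER BY created_at DESC
LIMIT 50;
```

The BUFFERS option also reports how many pages each plan node touched, which makes the before/after difference concrete.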
Every standard B-tree index lookup in PostgreSQL is a two-step process: scan the index to find row pointers, then fetch the actual row data from the heap. That second step, the heap fetch, is the bottleneck: for queries returning many rows, each heap fetch is a random I/O operation scattered across the table.

A covering index eliminates the heap visit entirely. If all the columns your query needs exist in the index, PostgreSQL reads everything from the compact, ordered index structure. No random heap access. No wasted I/O.

Three things most teams miss about covering indexes:

**1. INCLUDE is different from a composite index.** Before PostgreSQL 11, you had to create a composite index on all columns: `(customer_id, customer_name, customer_email)`. This sorts on all three columns even though nobody searches by customer_name. The `INCLUDE` clause adds columns to the leaf pages without making them part of the sort key: `CREATE INDEX ON customers (customer_id) INCLUDE (customer_name, customer_email)`. Smaller key, cleaner semantics.

**2. Index-only scans depend on vacuum.** PostgreSQL can skip the heap only for pages marked "all-visible" in the visibility map. If vacuum falls behind, pages are not marked all-visible and the executor falls back to heap fetches, even with a covering index. A covering index without healthy vacuum is a wasted investment.

**3. The performance gap widens dramatically with scale.** On a 100-million-row table, eliminating heap fetches can reduce query time from hundreds of milliseconds to single-digit milliseconds, a 10-100x improvement. The larger the table and the more rows your query returns, the bigger the win.

Look for `Index Only Scan` nodes with high `Heap Fetches` counts in your EXPLAIN ANALYZE output: `Heap Fetches: 1000` means the visibility map is stale, while `Heap Fetches: 0` is the goal. And if a hot query shows a plain `Index Scan`, check whether adding its output columns via INCLUDE would enable an index-only scan instead.
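The composite-vs-INCLUDE distinction in point 1 can be sketched like this (table and column names follow the post's example; the INCLUDE syntax assumes PostgreSQL 11+):

```sql
-- Composite index: all three columns are part of the sort key,
-- even though nobody searches by name or email.
CREATE INDEX idx_customers_wide
ON customers (customer_id, customer_name, customer_email);

-- INCLUDE: only customer_id is a key column; the other two are
-- stored in the leaf pages so index-only scans can return them.
CREATE INDEX idx_customers_covering
ON customers (customer_id) INCLUDE (customer_name, customer_email);

-- Verify: the plan should show
-- "Index Only Scan ... Heap Fetches: 0" (after a vacuum).
EXPLAIN (ANALYZE)
SELECT customer_name, customer_email
FROM customers
WHERE customer_id = 42;
```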
Practical guide with INCLUDE vs composite examples, dashboard query patterns, and vacuum considerations: https://lnkd.in/eFhhvyU6 #PostgreSQL #DatabasePerformance #Indexing #CoveringIndex #SoftwareEngineering #DevOps
If your PostgreSQL queries search inside JSONB, arrays, or text and you don't have GIN indexes, those queries fall back to sequential scans. B-tree indexes handle scalar comparisons (equality, range, ordering), and an expression B-tree can cover one extracted key, but B-tree cannot index values *inside* composite data types: a JSONB column with dozens of keys, an array of tags, a tsvector of text lexemes. Without a GIN index, PostgreSQL reads every row in the table and evaluates the condition one by one.

Three things most teams get wrong about GIN indexes:

1. Operator class choice matters. For JSONB there are two options: the default jsonb_ops (supports @>, ?, ?|, ?&) and jsonb_path_ops (supports only @> but is 2-3x smaller and faster). If your queries only use containment checks (@>), jsonb_path_ops is the better choice. Create one with the wrong operator class and the index gets silently ignored.

2. fastupdate causes unpredictable latency. GIN indexes batch insertions into a pending list for efficiency. Most inserts are fast, but the insert that overflows the pending list pays the cost of flushing it, and every query has to scan the unflushed list, causing unexpected slowdowns. For consistent query latency, disable it: ALTER INDEX ... SET (fastupdate = off). The tradeoff is slower inserts.

3. LIKE '%pattern%' needs a trigram GIN index, not a regular one. B-tree indexes require a fixed prefix. Only a GIN index with pg_trgm's gin_trgm_ops operator class can accelerate leading-wildcard pattern matches and similarity searches.

The rule is simple: every JSONB column queried with @> or ? needs a GIN index. Every array column queried with @> or && needs a GIN index. Every tsvector column needs a GIN index. Every text column searched with LIKE '%pattern%' needs a trigram GIN index. Build this into your schema design process. Don't wait for production complaints.
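A minimal sketch of points 1 and 3; the table and column names here are illustrative, and the trigram index assumes the pg_trgm extension can be installed:

```sql
-- JSONB containment: jsonb_path_ops is the smaller/faster operator
-- class when the only operator you need is @>.
CREATE INDEX idx_docs_data
ON docs USING gin (data jsonb_path_ops);

-- This containment query can now use the GIN index:
SELECT * FROM docs WHERE data @> '{"tags": ["urgent"]}';

-- Leading-wildcard LIKE needs pg_trgm, not a plain B-tree:
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX idx_docs_title_trgm
ON docs USING gin (title gin_trgm_ops);

SELECT * FROM docs WHERE title LIKE '%invoice%';
```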
Full guide with operator class comparison, full-text search setup, and performance tuning: https://lnkd.in/e4vWHuVb #PostgreSQL #GINIndex #JSONB #FullTextSearch #DatabasePerformance #DevOps
Most developers use PostgreSQL indexes every day, but very few understand what happens when you add more than one column to them. Let's fix that. Composite (multi-column) indexes are one of the most misunderstood performance tools in Postgres.

Here's the trap people fall into:

```sql
-- You create this index
CREATE INDEX idx_orders ON orders (user_id, status, created_at);

-- Then you query this way and wonder why it's slow
SELECT * FROM orders WHERE status = 'pending';
```

The query above won't use the index efficiently, or may skip it entirely. Why? Column order. Postgres reads a composite index like a phone book: sorted first by last name, then first name. If you search by first name without specifying a last name, you're flipping through the whole book.

The rule of thumb: put the most selective column first, and Postgres can use the index for any query that filters on a prefix of the column list.

```sql
-- This CAN use the index (prefix match)
SELECT * FROM orders WHERE user_id = 42;

-- This CAN also use it (full prefix chain)
SELECT * FROM orders WHERE user_id = 42 AND status = 'pending';

-- This CANNOT efficiently use it (skips user_id)
SELECT * FROM orders
WHERE status = 'pending' AND created_at > now() - interval '7 days';
```

A few extra things worth knowing:
• Equality conditions (=) should come before range conditions (<, >, BETWEEN) in your column order; ranges "break" the prefix chain.
• Use EXPLAIN (ANALYZE, BUFFERS) to verify your index is actually being used. Don't assume.
• For the "skipped column" pattern above, consider a partial index or a separate single-column index on status.

Understanding index internals is the difference between a DBA who just creates indexes and one who actually reasons about query plans.

What's a composite index optimization win you've had on a real project? Drop it below 👇

#PostgreSQL #SQL #DatabaseEngineering #BackendDevelopment #SoftwareEngineering #DataEngineering #TechTips
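One way to handle the "skips user_id" query above is the partial index the post mentions; a sketch using the same illustrative table:

```sql
-- Only 'pending' rows are indexed, sorted by created_at, so the
-- status filter is baked into the index and the recency filter
-- becomes a cheap index range scan.
CREATE INDEX idx_orders_pending_recent
ON orders (created_at)
WHERE status = 'pending';

-- The previously slow query can now use it:
SELECT * FROM orders
WHERE status = 'pending' AND created_at > now() - interval '7 days';
```

The WHERE clause in the query must match (or imply) the index predicate for the planner to consider the partial index.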
PostgreSQL maintains two tiny files alongside every table that affect performance: the Visibility Map and the Free Space Map.

Visibility Map (VM):
- 2 bits per page: all_visible, all_frozen
- all_visible decides whether a query touches the heap; all_frozen decides whether VACUUM does
- If all_visible = 1, PostgreSQL can skip the heap entirely and serve the query from the index (index-only scan)
- If all_visible = 0, it must fetch the heap page and check row visibility via xmin/xmax
- If all_frozen = 1, all rows are frozen and VACUUM skips the page entirely (saves I/O)

Free Space Map (FSM):
- 1 byte per page (0–255)
- On INSERT or UPDATE, PostgreSQL reads that byte to find a page with room and jumps directly to it
- No byte, no shortcut: it would have to scan the table looking for space

The catch: both maps go stale quickly because of PostgreSQL's MVCC design:
- UPDATE = INSERT + mark the old row dead
- DELETE = mark the row dead; the space is not reusable yet
- In both cases the VM bit is cleared, and the FSM is not updated until space is reclaimed

VACUUM fixes all of this:
- Removes dead tuples
- Rewrites FSM bytes to reflect real free space
- Sets all_visible = 1 when the page is clean
- Sets all_frozen = 1 when all rows are old enough to freeze
- Prevents transaction ID wraparound (after ~2 billion transactions XIDs wrap around; frozen rows are immune)

So next time a query gets slow, don't just ask "is the query bad?" or "is the index covering?" Ask:
- When did VACUUM last run on this table?
- How many dead tuples are accumulating?
- Are our index-only scans actually skipping the heap?

#PostgreSQL #DatabaseInternals #BackendEngineering #Performance #pgpulse
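The three closing questions can be answered from the cumulative statistics views; a sketch (the table name is illustrative):

```sql
-- When did VACUUM last run, and how many dead tuples have piled up?
SELECT relname,
       last_vacuum,
       last_autovacuum,
       n_dead_tup,
       n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'bookings';

-- Are index-only scans actually skipping the heap?
-- Run EXPLAIN (ANALYZE) on the query and read the "Heap Fetches"
-- line under the Index Only Scan node: a healthy scan reports
-- Heap Fetches: 0; a large number means the VM bits are stale.
```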
Next time a query gets slow, don't just ask "is the query bad?" or "is the index covered?" Add the Visibility Map and Free Space Map (above) to your arsenal.
This is a great breakdown of how PostgreSQL actually behaves under the hood, especially the Visibility Map (VM) and Free Space Map (FSM).

What we consistently see in production is this: teams don't struggle with understanding these concepts, they struggle with observing them in real time. Questions like:
- Are index-only scans actually skipping heap reads?
- Is autovacuum keeping up or silently falling behind?
- How quickly is XID age increasing?

By the time performance drops, VM bits are already cleared, the FSM is outdated, and bloat has started accumulating. pgpulse is designed to surface these internals (VM, FSM, vacuum behavior, bloat, and more) in real time, before they become production issues.

#pgpulse #supabase #databases #postgres #database #performance
#postgres users: interesting detail on what VACUUM can do for the performance of your databases. I liked the three important questions to ask whenever your DB has slow queries:
- When did VACUUM last run on this table?
- How many dead tuples are accumulating?
- Are our index-only scans actually skipping the heap?

#postgres #pgpulse #supabase #neon #railway #DatabaseInternals #BackendEngineering #Performance
🚀 SequelPG v0.11.1 is live.

If you work with PostgreSQL every day, this will feel familiar: you run a query, you tweak it, you come back to something from yesterday, and you try to remember what actually worked. Most tools treat query history as just a log. I don't think that's enough.

In this release, I rebuilt Query History from scratch. Now it's something you actually use:
- Quickly find past queries
- Reuse them without rewriting
- Debug faster with less context switching

I also refactored the Database Tools layer. You won't "see" most of it, but you'll feel it:
- More consistency
- Better performance
- A stronger foundation for what's coming next

I'm not trying to add more features. I'm trying to reduce friction when working with data.

Full release notes: https://lnkd.in/dFmaV_xH

If you use PostgreSQL, I'd really value your feedback.

#PostgreSQL #DeveloperTools #IndieHacker #BuildInPublic #SwiftUI #DX
A PostgreSQL table with 500 million rows doesn't just slow down queries. It slows down everything: VACUUM takes hours, index builds lock the table, and deleting old data generates massive WAL and leaves bloat behind.

Table partitioning splits a large table into smaller physical pieces while keeping it a single logical table. The query planner uses partition pruning to scan only the relevant partitions. Here's what matters in practice:

1. The partition key must appear in your most common WHERE clauses. If 90% of your queries filter on event_timestamp, partition by timestamp. A partition key that queries don't filter on provides zero benefit and only adds planning overhead. Always verify with EXPLAIN before committing to a scheme; if pruning doesn't activate, the partitioning isn't helping.

2. Partition count matters more than people think. Monthly partitions for 5 years = 60 partitions (reasonable). Daily partitions for 5 years = 1,825 partitions, and the planner slows down noticeably because it evaluates each partition during query planning. Keep the count manageable, or use TimescaleDB, which is specifically optimized for high partition counts.

3. The biggest operational win is instant data removal. Dropping a partition takes milliseconds and generates almost no WAL. Compare that to DELETE FROM events WHERE event_timestamp < '2024-02-01' on 100 million rows: that can take an hour, generates massive WAL, and leaves dead tuples for VACUUM to clean up.

The biggest gotcha: you can't convert an existing table to a partitioned table in place. You need to create a new partitioned table, migrate data in batches, and swap with a rename in a single transaction. For zero-downtime migrations, add a trigger or logical replication to capture writes during the migration.

Automate the partition lifecycle with pg_partman or cron. A missing future partition causes INSERT failures. A forgotten old partition wastes storage.
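A minimal range-partitioning sketch following the post's event_timestamp example (declarative partitioning, PostgreSQL 10+; table and column names are illustrative):

```sql
-- Parent table: a single logical table, partitioned by time range.
CREATE TABLE events (
    event_id        bigint,
    event_timestamp timestamptz NOT NULL,
    payload         jsonb
) PARTITION BY RANGE (event_timestamp);

-- One physical partition per month.
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Verify pruning: this plan should touch only events_2024_02.
EXPLAIN SELECT * FROM events
WHERE event_timestamp >= '2024-02-10'
  AND event_timestamp <  '2024-02-11';

-- Instant data removal: drop the old partition instead of DELETE.
DROP TABLE events_2024_01;
```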
Full guide with range, list, hash strategies, migration patterns, and pg_partman setup: https://lnkd.in/eGUj8zXC #PostgreSQL #TablePartitioning #DatabasePerformance #DataEngineering #DevOps #SRE
I used JSON in PostgreSQL for 6 months before someone told me I was doing it wrong. Here is what nobody explains clearly: JSON and JSONB are not the same type with different names. They are completely different storage mechanisms.

JSON stores your data as plain text. Every time you query it, PostgreSQL re-parses the string. There is no GIN operator class for plain json, so you can't index inside the document. On a table with 1 million rows, a query searching inside the column can take 4 seconds.

JSONB stores your data in a decomposed binary format. Parsed once on write. Never again. You can put a GIN index on the column and search any key inside the blob. The same query on the same 1 million rows takes 4 milliseconds.

That is not a typo. 4 seconds vs 4 milliseconds. 1000x faster. One word changed.

There is one other difference nobody mentions: JSON keeps duplicate keys and preserves key order. Sounds useful. It is not. Key order in JSON is meaningless, and duplicate keys make your data silently inconsistent. JSONB deduplicates keys automatically: last value wins. Clean data, every time.

The only reason to use JSON over JSONB is if you genuinely need to preserve exact whitespace and key ordering in your output. In 3 years of PostgreSQL work, I have needed that exactly zero times.

Default to JSONB. Add a GIN index on the columns you query. Watch your slow JSON queries disappear.

#JSON #JSONB #DatabaseOptimization #QueryOptimization #PerformanceTuning #DataModeling #PostgreSQL #Database #DataEngineering #Backend #GIS
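A minimal sketch of the recommended setup; table and column names are illustrative, and jsonb_path_ops assumes your queries only need @> containment:

```sql
-- JSONB column with a GIN index covering the whole document.
CREATE TABLE user_events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

-- jsonb_path_ops: smaller and faster than the default jsonb_ops,
-- but supports only the @> containment operator.
CREATE INDEX idx_user_events_payload
ON user_events USING gin (payload jsonb_path_ops);

-- This containment query can now use the index:
SELECT id FROM user_events
WHERE payload @> '{"status": "confirmed"}';
```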