Postgres Table Storage: Pages and Tuples

A Postgres table is not a list of rows. It's a collection of 8KB pages, and rows are just byte sequences packed inside them.

Most engineers think of a table as an ordered structure - insert a row, it goes to the end. Postgres thinks in pages. Every table is a heap file, split into fixed-size blocks (8KB by default). Rows go wherever there's space.

Every page consists of the following:
1. A page header - metadata about the page itself
2. Item pointers - fixed-size slots right after the header, each pointing to a tuple's location within the page
3. Tuples - the actual rows, packed from the end of the page backward

Every tuple also carries its own header - which transaction inserted it, which deleted it, and other flags used to decide whether it's visible to a given transaction.

When a page has no room for a new row, Postgres puts the row on another page. No global directory tracks which row lives on which page, and the heap is intentionally unordered. This is why a sequential scan reads every page regardless of how many rows match. Without an index, there's no way to skip pages.

A table with heavy updates and deletes accumulates dead tuples. Old row versions keep sitting in pages, occupying space and getting read on every scan, until VACUUM reclaims them.

The physical layout of rows in Postgres directly affects query cost.

Storage internals almost never appear in Postgres tutorials. When did you first run into a dead tuple problem?

#PostgreSQL #DatabaseEngineering #BackendEngineering
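If the page layout feels abstract, here is a toy model in Python. It is a sketch, not Postgres code: `ToyPage`, `insert`, `mark_dead` and `scan` are names made up for illustration, alignment padding and tuple headers are ignored, and the "dead" flag is a simplification of what Postgres tracks. The sizes used (8192-byte page, 24-byte page header, 4-byte item pointer) match the Postgres defaults.

```python
PAGE_SIZE = 8192         # default Postgres block size
HEADER_SIZE = 24         # size of the real page header; its contents are not modeled
ITEM_POINTER_SIZE = 4    # size of one item pointer in Postgres


class ToyPage:
    """Toy heap page: item pointers grow forward, tuples grow backward."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.lower = HEADER_SIZE   # end of the item pointer array
        self.upper = PAGE_SIZE     # start of the tuple area
        self.slots = []            # one [offset, length, dead] entry per item pointer

    def free_space(self):
        # The gap between the pointer array and the tuple area.
        return self.upper - self.lower

    def insert(self, payload):
        """Store payload; return its 1-based slot number, or None if it does not fit."""
        if len(payload) + ITEM_POINTER_SIZE > self.free_space():
            return None
        self.upper -= len(payload)
        self.data[self.upper:self.upper + len(payload)] = payload
        self.slots.append([self.upper, len(payload), False])
        self.lower += ITEM_POINTER_SIZE
        return len(self.slots)

    def mark_dead(self, slot):
        # The bytes stay in the page; only the flag changes.
        self.slots[slot - 1][2] = True

    def scan(self):
        """Visit every slot, dead or not, and yield only the live payloads."""
        for offset, length, dead in self.slots:
            if not dead:
                yield bytes(self.data[offset:offset + length])


if __name__ == "__main__":
    page = ToyPage()
    first = page.insert(b"row-a")
    page.insert(b"row-b")
    page.mark_dead(first)
    print(list(page.scan()), page.free_space())
```

Notice that `mark_dead` does not give any free space back, and `scan` still walks over the dead slot. That is the bloat problem in miniature: in this model nothing shrinks until something compacts the page, which is the job VACUUM does in real Postgres.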
