🚫 Stop Using Integer Sort Orders in Your Database

🚫 Stop Using Integer Sort Orders in Your Database

Here’s what modern systems use instead — and why it matters more than you think.


The Problem Everyone Ignores

You have a list in your database: Tasks, spreadsheet rows, extracted document fields — each with a sort_order: 1, 2, 3, 4, 5.

Now a user drags item #5 to position #2.

👉 You update every row from position 2 onward.

  • 5 rows → 4 updates
  • 10,000 rows → 9,999 updates

All because of ONE move.

This creates: ❌ Write amplification ❌ Lock contention under concurrent edits ❌ Noisy audit logs (updates instead of intent)

This pattern quietly breaks down at scale.


Three Ways to Store Ordered Data

Let’s break it down:

1. Integer sort_order

  • ❌ Insert/reorder in middle → O(n) writes
  • ❌ Poor concurrency characteristics (row conflicts)
  • ❌ Expensive for large, frequently edited lists
  • ✅ Simple and easy to reason about

👉 Works fine for small or mostly static datasets.


2. Sparse integers (e.g., 1000, 2000, 3000)

  • ⚠️ Amortized O(1) inserts (while gaps exist)
  • ⚠️ Eventually requires rebalancing
  • ⚠️ Partial improvement for concurrency
  • ✅ Simple upgrade over naive integers

👉 Often “good enough” when reordering is infrequent.


3. Fractional Indexing (modern approach)

  • ✅ Amortized O(1) inserts
  • ✅ Single-row writes for reordering
  • ✅ Strong concurrency characteristics
  • ✅ No periodic reindexing in normal operation

👉 Designed for dynamic, user-reordered, multi-user systems.


What Is Fractional Indexing?

Instead of integers, assign lexicographically sortable string keys.

Example:

Before: A ("a0") B ("a1") C ("a2")

Insert X between A and B: X gets → "a0V"

After: A ("a0") X ("a0V") B ("a1") C ("a2")

👉 Only one row is inserted/updated 👉 Ordering is handled via string comparison


How It Works (Intuition)

  1. Use an ordered alphabet (commonly base-62: 0–9, A–Z, a–z)
  2. Generate a key between two existing keys
  3. If no space exists between them → extend key length
  4. Growth is localized — only dense regions expand

👉 In practice, keys remain short for most workloads.


Complexity (Accurate View)

  • Inserts: Amortized O(1)
  • Worst case: O(k) where k = key length
  • Reads: O(log n) (via indexed sort)

👉 Crucially: avoids O(n) rewrite cascades.


Concurrency Behavior

Fractional indexing is highly concurrency-friendly, but not magic:

  • Two clients inserting at the same position may generate the same key
  • Collisions are rare and handled via retry or tie-breaking
  • No large-scale row conflicts like integer shifting

👉 Net effect: dramatically fewer conflicts in collaborative systems.


Why This Matters (Especially for IDP)

If you’re building document processing or workflow systems:

Think:

  • Invoice line items
  • Contract clauses
  • Extracted fields with human corrections
  • Template + runtime data merging

With integer ordering: ❌ Many rows rewritten per edit ❌ Conflicts during concurrent changes ❌ Audit logs filled with mechanical updates

With fractional indexing: ✅ Single-row inserts ✅ Minimal conflict surface ✅ Cleaner, intent-driven audit trails


Recommended Schema

id            → UUID  
document_id   → UUID  
sort_key      → VARCHAR(64)  
field_name    → VARCHAR  
field_value   → TEXT  
confidence    → FLOAT  
        

Add an index on:

(document_id, sort_key)
        

👉 Efficient ordered reads + localized writes.


Getting Started

import { generateKeyBetween } from 'fractional-indexing';

const first  = generateKeyBetween(null, null);    // "a0"
const second = generateKeyBetween(first, null);   // "a1"
const middle = generateKeyBetween(first, second); // "a0V"
        

👉 Libraries handle edge cases like key growth and collisions.


Trade-offs (Don’t Skip This)

  • Keys grow in length in densely edited regions
  • Slightly more complex mental model than integers
  • Rare collisions require retry logic
  • Debugging order is less intuitive than numeric fields

👉 Trade complexity for scalability and correctness.


Key Takeaways

  • Integer ordering causes O(n) write amplification
  • Sparse integers delay the problem but don’t eliminate it
  • Fractional indexing provides amortized O(1) inserts
  • Well-suited for collaborative and reorder-heavy systems
  • Easy to implement, hard to outgrow


Next time you reach for sort_order INTEGER, pause.

Ask yourself: 👉 “Will this list ever be reordered frequently or concurrently?”

If yes — You already know the better approach.

To view or add a comment, sign in

More articles by Aniket Tatipamula

Others also viewed

Explore content categories