📊 DO YOU KNOW WHAT DATABASE INDEXING IS?

Slow queries? An INDEX solves it. 📊

**Indexing:**

```sql
-- Without an index
EXPLAIN SELECT * FROM users WHERE email = 'test@example.com';
-- Seq Scan (slow)

-- With an index
CREATE INDEX idx_users_email ON users(email);
EXPLAIN SELECT * FROM users WHERE email = 'test@example.com';
-- Index Scan (fast!)
```

Types:
✅ B-Tree (default, range queries)
✅ Hash (exact match)
✅ GIN (full-text, JSON)
✅ Composite (multiple columns)

```sql
-- Composite index
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Partial index
CREATE INDEX idx_active_users ON users(id) WHERE active = true;
```

And the result?
🎯 Queries up to 100x faster
🎯 Less CPU
🎯 Happy users

⚠️ Watch out:
- Index overhead on writes
- Extra storage
- Choose wisely

---

Follow me for more tips! And grab the COUPONS to join us:
🔗 https://devopsforlife.io
NINJA - 20% OFF: https://lnkd.in/dchtzbWH
JEDI - 20% OFF: https://lnkd.in/d9G9R-Ew
SUPER SAIYAN - 20% OFF: https://lnkd.in/dtm2Hnj6

---

#devops #database #indexing #sql #performance #postgresql #devopsforlife
Database Indexing: Boost Query Speed with B-Tree, Hash, GIN, and Composite Indexes
More Relevant Posts
🚀 Database Indexing (Part 1): The Foundation of Fast Queries

Before scaling systems with partitioning or distributed caching, the first step is Database Indexing. If your queries are slow, you’re likely missing the right indexes.

🔹 What is Database Indexing?
Database Indexing is a technique used to improve query performance by creating a structure that allows faster data lookup.
👉 Like a book index — jump directly to the data instead of scanning everything.

🔹 How It Works
Without Index ❌ ➡ Full Table Scan (O(n))
With Index ✅ ➡ Faster Lookup (O(log n))

🔹 Types of Indexes

1️⃣ B-Tree Index (Most Common)
- Default index in most databases
- Supports: equality (=), range (>, <, BETWEEN), sorting

2️⃣ Hash Index
- Best for exact match (=), very fast lookup
- 👉 Limitations: ❌ no range queries, ❌ no sorting

3️⃣ Composite Index
- Multiple columns, e.g. (user_id, created_at)
- 👉 Follows the left-to-right rule

4️⃣ Unique Index
- Ensures no duplicate values, e.g. email, username

5️⃣ Full-Text Index
- Used for search functionality, e.g. product search, keyword search

🔹 Benefits
✅ Faster query execution
✅ Efficient searching
✅ Reduced full table scans
✅ Better performance for large datasets

💬 In Part 2, I’ll cover real-world problems, trade-offs, and best practices.

#Database #BackendDevelopment #Java #SQL #Performance #Optimization
I reduced an API response from ~3.8s to ~40ms without changing application code. The fix? One composite index.

Here's the indexing strategy most backend developers skip:

𝗪𝗵𝗮𝘁 𝗮𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝘀𝗹𝗼𝘄𝘀 𝘆𝗼𝘂𝗿 𝗗𝗕:
→ Full table scans on WHERE clauses
→ Missing indexes on JOIN columns
→ Selecting * instead of specific columns
→ N+1 queries masquerading as "features"

𝗧𝗵𝗲 𝗰𝗼𝗺𝗽𝗼𝘀𝗶𝘁𝗲 𝗶𝗻𝗱𝗲𝘅 𝗿𝘂𝗹𝗲 𝗻𝗼𝗯𝗼𝗱𝘆 𝘁𝗲𝗮𝗰𝗵𝗲𝘀:
Index column order matters. Put the most selective column first (highest cardinality) AND match the index order with your WHERE clause.

Example: WHERE user_id = ? AND status = ?
→ index on (user_id, status), NOT (status, user_id)

Bonus: If your query has ORDER BY, include it in the index to avoid an extra sort step.

𝗪𝗵𝗲𝗻 𝗡𝗢𝗧 𝘁𝗼 𝗶𝗻𝗱𝗲𝘅:
→ Small tables (often <10k rows)
→ Write-heavy tables
→ Low-cardinality columns (boolean, status with few values)

𝗛𝗲𝗿𝗲’𝘀 𝗲𝘅𝗮𝗰𝘁𝗹𝘆 𝗵𝗼𝘄 𝗜 𝗱𝗲𝗯𝘂𝗴𝗴𝗲𝗱 𝗶𝘁:
1. Run EXPLAIN ANALYZE on your slowest queries
2. Look for "Seq Scan" or high "Rows Removed by Filter"
3. Add indexes strategically — not blindly

Performance is free. You just have to know where to look.

#Backend #Database #PostgreSQL #Performance #SQL #SoftwareEngineering
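The left-to-right rule can be checked directly. Here is a small sqlite3 sketch (hypothetical orders table): the composite index serves the WHERE clause whose columns line up with it, but a query on the trailing column alone cannot be searched through it:

```python
import sqlite3

# Hypothetical orders table; column names follow the post's example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INT, status TEXT)")
conn.executemany("INSERT INTO orders (user_id, status) VALUES (?, ?)",
                 [(i % 500, "paid" if i % 3 else "pending") for i in range(3000)])

# Most selective column first, matching the WHERE clause order
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")

# Both predicates line up with the index: the planner does an index SEARCH
used = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE user_id = 42 AND status = 'paid'").fetchall()

# Left-to-right rule: filtering on status alone skips the leading user_id
# column, so the composite index can at best be scanned, never searched
skipped = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE status = 'paid'").fetchall()
```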
Why you can't "just" concurrently index partitioned tables in Postgres

We’ve been working on partitioning some of our fastest-growing tables recently. Since these are high-traffic, write-heavy tables, we tried to add indexes concurrently to avoid downtime. That's when we hit a classic wall:

ERROR: cannot create index on partitioned table "..." concurrently

The "Why": CREATE INDEX CONCURRENTLY is designed to avoid locking your table. But since a partitioned table is a virtual parent, Postgres can’t coordinate a "no-lock" build across multiple child tables at once from a single command. It defaults to an Access Exclusive lock, which, in a production environment, is a dealbreaker for performance.

The "Create-and-Link" pattern we used instead:
• The Shell: Run CREATE INDEX ... ON ONLY parent_table. This adds the metadata to the parent but leaves the index invalid at first.
• The Background Build: Run CREATE INDEX CONCURRENTLY on each individual partition. This builds the index without blocking our production traffic.
• The Link: Attach each partition index to the parent: ALTER INDEX parent_idx ATTACH PARTITION child_idx.

The Result: Once the final partition is linked, the parent index automatically becomes valid. We got zero downtime, and any future partitions will now inherit this index automatically. It's a bit more manual work, but it's a lifesaver for keeping a high-growth system responsive.

#PostgreSQL #DatabaseEngineering #SystemDesign #Backend #Scalability #SDE
A question for every dev who's ever designed a database table: Did you design it for 10,000 rows or 10 million?

Because most schemas are designed for the demo. For the MVP. For "let's just ship it and optimize later." And "later" arrives as a 3 AM P1 page, 18 months down the road, when the table that "works fine" has grown 1000x and suddenly nothing works fine.

We call these the haunting patterns. Schema decisions that feel harmless at small scale and become structural nightmares at large scale.

Data Drop #6 covers the big three:

→ UUIDs as primary keys — Random values fragment your B-tree indexes. At 500M rows, your index is bloated, your writes scatter across random pages, and your cache hit ratio craters. Sequential IDs exist for a reason.

→ "Just make it nullable" — The path of least resistance at design time. The source of a thousand bugs at query time. NULL doesn't equal NULL. Your aggregations silently skip rows. Your joins produce unexpected results. Nullable should be a conscious choice, not a default.

→ The EAV trap — Entity-Attribute-Value: the schema pattern that says "I don't want to commit to a data model." Congratulations, you now have a key-value store with the performance of a relational database and the flexibility of neither.

Design for the table size you're going to have. Not the one you have today.

Data Drop #6. Day 6 of 23.

#AprilDataDrops #PostgreSQL #DataDrop6 #SchemaDesign #Database #Performance #OpenSourceDB
OpenSource DB | Lahari Giddi
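The "NULL doesn't equal NULL" and "aggregations silently skip rows" points are concrete enough to demo in a few lines. A tiny sqlite3 sketch with a hypothetical one-column table:

```python
import sqlite3

# Hypothetical single-column table to show NULL semantics
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val INT)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,)])

# NULL = NULL evaluates to NULL, not true: the NULL row never matches
matches = conn.execute("SELECT COUNT(*) FROM t WHERE val = NULL").fetchone()[0]

# Aggregates silently skip NULLs: COUNT(val) differs from COUNT(*)
count_star = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_val = conn.execute("SELECT COUNT(val) FROM t").fetchone()[0]
print(matches, count_star, count_val)  # 0 3 2
```

Three rows in, but the equality filter finds none of the NULLs and COUNT(val) quietly reports 2: exactly the class of bug the post is warning about.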
Your backend feels slow. You blame the API. But the real problem? Your database.

Common mistakes I see (especially with PostgreSQL):
→ Missing indexes on frequently filtered columns
→ SELECT * everywhere (pulling unnecessary data)
→ N+1 queries killing performance
→ No pagination (loading thousands of rows)
→ Ignoring slow query logs

Result:
✘ High latency
✘ Server overload
✘ Poor user experience

What actually fixes it:
1. Add indexes where filters & joins happen
2. Use EXPLAIN ANALYZE (not guessing)
3. Select only required columns
4. Batch queries / avoid N+1
5. Enable slow query logging

Rule: “If you didn’t measure your query, you didn’t optimize it.”

Most backend issues are not backend issues. They’re database problems in disguise.

#PostgreSQL #Backend #Performance #Database #SoftwareEngineering
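Fix #4 (batching away N+1) can be sketched quickly. A sqlite3 example with made-up users/posts tables, collapsing 100 per-user queries into one IN (...) query:

```python
import sqlite3

# Made-up users/posts schema to contrast N+1 with one batched query
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INT, title TEXT);
CREATE INDEX idx_posts_user ON posts(user_id);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(100)])
conn.executemany("INSERT INTO posts (user_id, title) VALUES (?, ?)",
                 [(i % 100, f"post {i}") for i in range(1000)])

user_ids = [row[0] for row in conn.execute("SELECT id FROM users")]

# N+1: one query per user (100 round trips to the database)
n_plus_1 = [conn.execute("SELECT title FROM posts WHERE user_id = ?",
                         (uid,)).fetchall() for uid in user_ids]

# Batched: the same rows fetched in a single round trip
placeholders = ",".join("?" * len(user_ids))
batched = conn.execute(
    f"SELECT user_id, title FROM posts WHERE user_id IN ({placeholders})",
    user_ids).fetchall()
print(sum(len(p) for p in n_plus_1), len(batched))  # 1000 1000
```

Same 1000 rows either way; on a networked database the difference is 100 round trips versus one.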
🚀 12 Rules for High-Performance SQL Stored Procedures

When it comes to backend engineering, database bottlenecks can be the "silent killers" of your application's performance. After years of evaluating execution plans, I’ve identified these twelve optimization strategies as having the most significant impact.

The Basics:
1. SET NOCOUNT ON: Prevent unnecessary "rows affected" messages from being sent over the network.
2. Specify Columns: Never SELECT *; retrieve only the columns you actually need to minimize I/O.
3. Schema Qualification: Use [dbo].[Table]. This saves the engine from searching every schema during compilation.
4. IF EXISTS > COUNT(): Do not scan the entire table just to find out whether a record exists.

The Architecture Level:
5. Write SARGable Queries: Use WHERE clauses that can make use of an index on the referenced column, and never wrap that column in a function. For example, instead of YEAR(Date) = 2024, write Date >= '2024-01-01' AND Date < '2025-01-01'.
6. Lean Transactions: The longer a transaction runs, the more likely you are to hit deadlocks or blocking.
7. Prefer UNION ALL to UNION: Do not pay for the expensive internal Sort/Distinct unless you truly need unique rows.
8. Avoid Scalar Functions: They behave like hidden loops; use Inline Table-Valued Functions instead so the optimizer can build a better execution plan.

Pro-Level Tuning:
9. Table Vars vs. Temp Tables: Use @Table for small datasets (<1,000 rows). They cause fewer recompiles but lack statistics (the optimizer assumes 1 row). Use #Temp for large datasets or complex joins: they support full statistics and indexing, letting the engine generate an accurate execution plan.
10. Manage Parameter Sniffing: Use local variables to prevent the engine from locking into a sub-optimal plan based on one specific input.
11. Set-Based Logic: Ditch the cursors. SQL is built for sets, not row-by-row looping.
12. Avoid Dynamic SQL: Unparameterized dynamic SQL presents significant security vulnerabilities and also reduces execution-plan reuse.

#SQLServer #DatabaseOptimization #BackendEngineering #DotNet #CleanCode #ProgrammingTips #SoftwareArchitecture
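Rule 5 is engine-agnostic, so here is the YEAR(Date) example transposed to sqlite3, where strftime plays the role of YEAR and the table and index names are invented. The planner reacts the same way SQL Server's does:

```python
import sqlite3

# Invented orders table; strftime('%Y', ...) stands in for YEAR(Date)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT)")
conn.executemany("INSERT INTO orders (order_date) VALUES (?)",
                 [(f"202{i % 5}-0{i % 9 + 1}-15",) for i in range(1000)])
conn.execute("CREATE INDEX idx_orders_date ON orders(order_date)")

# Non-SARGable: the function on the column hides it from the index
bad = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE strftime('%Y', order_date) = '2024'").fetchall()

# SARGable: a bare-column range predicate walks the index instead
good = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'").fetchall()
```

The first plan never reaches an index SEARCH; the rewritten range predicate does.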
My updates on DB (SQL) learning this week!

A 'foreign key' is a column in one table that references another table's 'primary key'. Referential actions control what happens to dependent rows: CASCADE (deletion propagates to connected references), SET NULL (a soft deletion), and RESTRICT (no deletion at all).

Learned and understood INNER, LEFT, and FULL OUTER JOIN, and when each is useful based on the requirements.

Then I came to know about 'EXPLAIN ANALYZE', a diagnostic tool that shows real runtime statistics! Dived deeper into INDEXING: how we create an index, how a B-Tree reduces query execution time to O(log n), and how a covering ('non-key', or INCLUDE) index stores values at its leaf level to avoid an extra lookup!

Next, I learned about transactions, which involve BEGIN, UPDATE, then COMMIT or ROLLBACK, and got to know about the 'dirty read': reading values before they are committed, which is not good! PostgreSQL doesn't allow dirty reads.

Finally, I learned about ACID compliance in databases, which has 4 parts. ATOMICITY means nothing is half-done: a transaction must commit fully, else it rolls back. CONSISTENCY means a transaction brings the database from one valid state to another. ISOLATION means concurrent transactions don't interfere with each other. And DURABILITY means that once a transaction is committed, it remains permanently recorded, even in the event of a system crash or power outage!!!!

#PostgreSQL #SQL #Database
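Atomicity in practice: a sqlite3 sketch of a transfer that fails mid-transaction and is rolled back, leaving both balances untouched (the accounts table and the business rule are made up for illustration):

```python
import sqlite3

# Made-up accounts table; the transfer is rolled back on failure
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 100)")
conn.commit()

try:
    conn.execute("UPDATE accounts SET balance = balance - 150 WHERE id = 1")
    # A business rule fails mid-transaction: the debit went negative
    if conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0] < 0:
        raise ValueError("insufficient funds")
    conn.execute("UPDATE accounts SET balance = balance + 150 WHERE id = 2")
    conn.commit()
except ValueError:
    conn.rollback()  # ATOMICITY: the partial debit is undone

balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(balances)  # [(100,), (100,)]
```

The debit was executed, then rolled back: the database never exposes the half-finished state, which is exactly the "full or nothing" behavior described above.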
I used to think indexes are just something you “add later” to optimize queries. Turns out… that thinking is exactly why queries become slow.

Recently, I came across this amazing resource:
🔗 https://lnkd.in/gh4X-Fwj

Here’s what changed my understanding:

💡 1. Indexing is NOT a DBA task — it’s a developer responsibility
If you don’t design queries with indexes in mind, performance issues are inevitable.

💡 2. The B-Tree structure actually explains everything
Understanding how indexes are stored (tree + linked structure) made it clear why some queries are fast and others are painfully slow.

💡 3. The WHERE clause decides index usage
Even a small mistake like using functions or the wrong column order can completely bypass indexes.

💡 4. More indexes ≠ better performance
Over-indexing can actually slow down writes and increase complexity.

📌 My biggest takeaway:
“SQL performance is not about writing queries that work — it’s about writing queries that scale.”

If you're learning SQL or preparing for backend/data engineering roles, this resource is gold.

#SQL #DataEngineering #BackendDevelopment #SystemDesign #LearningInPublic
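Point 3 (functions bypassing indexes) is easy to reproduce. A sqlite3 sketch with a hypothetical users table: wrapping the column in lower() skips the plain index, while an expression index on lower(email) restores the lookup. (Expression indexes exist in SQLite and Postgres; MySQL uses functional index syntax for the same idea.)

```python
import sqlite3

# Hypothetical users table with mixed-case emails
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"User{i}@Example.com",) for i in range(1000)])
conn.execute("CREATE INDEX idx_users_email ON users(email)")

# Wrapping the indexed column in a function bypasses idx_users_email
plan_fn = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users "
    "WHERE lower(email) = 'user1@example.com'").fetchall()

# An expression index on the same function restores the fast lookup
conn.execute("CREATE INDEX idx_users_email_lower ON users(lower(email))")
plan_expr = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users "
    "WHERE lower(email) = 'user1@example.com'").fetchall()
```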
Is COUNT(*) really a BAD IDEA for existence checks in SQL?

I didn’t just believe it - I tested it, as below.

Dataset: ~20,000 records
Indexed column: user_email

Queries tested:

-- 1. COUNT(*)
SELECT COUNT(*) FROM users WHERE user_email = 'user830@example.com';

-- 2. EXISTS
SELECT EXISTS (
  SELECT 1 FROM users WHERE user_email = 'user830@example.com'
);

-- 3. LIMIT 1
SELECT 1 FROM users WHERE user_email = 'user830@example.com' LIMIT 1;

EXPLAIN output (screenshot attached) - all queries show:
type: const
rows: 1
Using index

Meaning: all 3 queries are optimized and fast.

🤔 So… is COUNT(*) really bad? Not always. With proper indexing:
- Even COUNT(*) performs efficiently
- No full table scan
- A direct index lookup happens

✅ Real-World Takeaway
✔ Use EXISTS → for correct intent (true/false)
✔ Use LIMIT 1 → for simple & fast API checks
✔ Use COUNT(*) → when you actually need the count

🔥 The Real Lesson
❌ The problem is NOT COUNT(*)
✅ The problem is missing indexes

💬 Final Thought
👉 “First optimize your indexing… then worry about query patterns.”

📸 Sharing my real EXPLAIN output below 👇 Have you tested this in your system?

#SQL #MySQL #DatabaseOptimization #MuraliCodes #BackendDevelopment #PerformanceTuning #Developers #LearningInPublic
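The same three patterns are runnable outside MySQL too. A sqlite3 sketch with a made-up users table mirroring the post's setup (indexed user_email, smaller dataset):

```python
import sqlite3

# Made-up users table mirroring the post's setup (indexed user_email)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_email TEXT)")
conn.executemany("INSERT INTO users (user_email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])
conn.execute("CREATE INDEX idx_users_email ON users(user_email)")

email = "user830@example.com"

# 1. EXISTS: correct intent, returns a plain true/false
exists = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM users WHERE user_email = ?)",
    (email,)).fetchone()[0]

# 2. LIMIT 1: one row (or none) for a simple API-style check
limit1 = conn.execute(
    "SELECT 1 FROM users WHERE user_email = ? LIMIT 1", (email,)).fetchone()

# 3. COUNT(*): use it when you actually need the number
count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE user_email = ?", (email,)).fetchone()[0]
print(exists, limit1, count)  # 1 (1,) 1
```

All three hit the index; they differ only in the shape of the answer, which is the post's takeaway about intent.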
A SQL query was freezing our system — and the root cause surprised us

Two years ago, I faced a production issue that I never forgot. A critical query, used by clients to check their history, was freezing the entire system. The table had more than 2 million rows.

At first, the assumption was simple: “We need to upgrade the database infrastructure.” But even after that, the problem persisted.

So I sat down and started analyzing the query in detail. After some time debugging, I found the issue: the query had an ORDER BY DESC. Even with indexes and date filters (BETWEEN), this forced the database to sort a huge dataset, causing the slowdown.

We removed the ORDER BY. And suddenly, the query became fast again and the system stopped freezing.

---

What I learned

That experience completely changed how I look at SQL performance. Sometimes:
- It is not about infrastructure
- It is not about scaling
- It is about understanding what the database is really doing

---

So I built a project to study this

I recreated similar scenarios to measure the impact of:
- Indexing strategies
- Query structure
- Filtering patterns

Results:
- Full table scan vs indexed query: 21.36ms → 5.10ms (76.1%)
- LIKE wildcard vs exact match: 37.06ms → 1.99ms (94.6%)
- Slow JOIN vs optimized JOIN: 19.12ms → 14.56ms (23.8%)

---

Project: https://lnkd.in/dCn6VMYq

---

Final takeaway

Sometimes the biggest performance issue is just a small detail in your query.
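The fix here was dropping the ORDER BY, but when the sort is actually required there is another option: an index whose column order covers both the filter and the sort, so no sort step runs at all. A sqlite3 sketch with an invented history table, watching the temp-sort step disappear from the plan:

```python
import sqlite3

# Invented history table standing in for the client-history query
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history "
             "(id INTEGER PRIMARY KEY, client_id INT, created_at TEXT)")
conn.executemany("INSERT INTO history (client_id, created_at) VALUES (?, ?)",
                 [(i % 50, f"2024-{i % 12 + 1:02d}-{i % 28 + 1:02d}")
                  for i in range(2000)])

query = ("SELECT id FROM history WHERE client_id = 7 "
         "ORDER BY created_at DESC")

# No matching index: the engine filters, then sorts (a TEMP B-TREE step)
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Index ordered as (filter column, sort column DESC): rows already come
# out in the right order, so the explicit sort step disappears
conn.execute("CREATE INDEX idx_history_client_date "
             "ON history(client_id, created_at DESC)")
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
```

Postgres supports the same trick (CREATE INDEX ... (client_id, created_at DESC)), which keeps the DESC ordering without a runtime sort over millions of rows.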