🚀 SQL Indexing: From “it makes queries faster” to actually understanding why

For a long time, I used to hear: “Just add an index — your query will be faster.” But I never really understood what actually changes under the hood. Recently, I explored this using EXPLAIN ANALYZE — and the difference was eye-opening.

🧠 Before indexing
SELECT * FROM marks WHERE name = 'Chinu';
Execution plan: ➡️ Parallel Sequential Scan
- The database scans the entire table
- Checks every row
- Cost grows linearly with data size
⏱️ Higher execution time as data increases

⚡ After adding an index
CREATE INDEX idx_name ON marks(name);
Execution plan: ➡️ Index Scan
- Uses a B-Tree structure internally
- Navigates like a search tree (O(log n))
- Directly jumps to matching rows
⏱️ Significant performance improvement

🔍 Going one step further — Covering Index
CREATE INDEX idx_name_marks ON marks(name) INCLUDE (marks);
(Note: the covering index needs its own name — index names must be unique.)
Now for this query:
SELECT name, marks FROM marks WHERE name = 'Chinu';
➡️ Index Only Scan
- Required data is already present inside the index
- No need to access the main table (heap)
- Eliminates extra lookups

💡 What actually changed?
- The data didn’t change.
- The query didn’t change.
👉 The data access strategy changed.
❌ Sequential Scan → “Check everything”
✅ Index Scan → “Navigate intelligently”
🚀 Index Only Scan → “Don’t even touch the table”

⚠️ Trade-offs
Indexes are powerful, but not free:
- Additional storage overhead
- Slower INSERT / UPDATE operations
- Must be designed based on query patterns

📌 Final thought
“Indexes don’t just make queries faster — they change how databases think about data access.”
Exploring more around execution plans, query optimization, and database internals.

#SQL #BackendDevelopment #Database #Performance #LearningInPublic #Developers
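The before/after plans above can be reproduced end to end. Here is a minimal sketch using Python's built-in sqlite3 as a portable stand-in for the post's PostgreSQL setup (the table name and query mirror the post; the sample rows are invented, and since SQLite has no INCLUDE clause, a composite index on (name, marks) plays the covering-index role):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE marks (name TEXT, marks INTEGER)")
cur.executemany("INSERT INTO marks VALUES (?, ?)",
                [("Chinu", 95), ("Asha", 88), ("Ravi", 72)])

def plan(sql):
    # EXPLAIN QUERY PLAN reports how SQLite intends to access the data;
    # the human-readable detail is in column 4 of each row.
    return " ".join(row[3] for row in cur.execute("EXPLAIN QUERY PLAN " + sql))

# Before indexing: the whole table is scanned.
before = plan("SELECT * FROM marks WHERE name = 'Chinu'")

# After a composite index on (name, marks): the index alone answers the query.
cur.execute("CREATE INDEX idx_name_marks ON marks(name, marks)")
after = plan("SELECT name, marks FROM marks WHERE name = 'Chinu'")

print(before)  # a full scan of marks
print(after)   # a search using the covering index
```

The "covering" wording in the second plan is SQLite's equivalent of PostgreSQL's Index Only Scan: the heap is never touched.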
✨ Day 62 – Subqueries & Logical Operators in SQL

Continuing my journey with SQL, today I explored how to write smarter and more dynamic queries using subqueries and logical operators 📊

🔹 Subqueries (Nested Queries)
A subquery is a query inside another query. It helps break complex problems into smaller, manageable parts.
✔ Used inside SELECT, WHERE, or FROM clauses
✔ Can return single or multiple values
✔ Makes queries more dynamic and powerful
👉 Common use cases:
• Filtering data based on another query
• Comparing values within the same table
• Creating temporary result sets

🔹 Logical Operators
Logical operators are used to combine multiple conditions in a query.
✔ AND – All conditions must be true
✔ OR – At least one condition must be true
✔ NOT – Reverses a condition
👉 Why they matter:
• Help refine and filter data precisely
• Allow complex decision-making in queries
• Improve accuracy of results

🔹 Other Useful Operators
✔ IN – Matches any value in a list or subquery
✔ BETWEEN – Filters within a range
✔ LIKE – Pattern matching
✔ EXISTS – Checks if a subquery returns data

🔹 Why These Concepts Matter
✔ Handle complex data retrieval scenarios
✔ Build flexible and efficient queries
✔ Essential for real-world data analysis and reporting

📌 Takeaway: Subqueries and logical operators make SQL smarter. They allow you to filter, compare, and analyze data in a more powerful and structured way.

#SQL #DataAnalytics #LearningJourney #Databases #TechSkills #FrontlineMedia
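A small runnable sketch tying the two ideas together: a subquery in the WHERE clause compared against each row, refined with AND and IN. It uses Python's sqlite3 and a hypothetical students table (names, departments, and scores are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (name TEXT, dept TEXT, score INTEGER)")
cur.executemany("INSERT INTO students VALUES (?, ?, ?)", [
    ("Asha", "CS", 91), ("Ravi", "CS", 62),
    ("Meena", "EE", 78), ("Kiran", "EE", 85),
])

# The subquery computes one value (the overall average, here 79);
# logical operators then combine it with a second condition.
rows = cur.execute("""
    SELECT name
    FROM students
    WHERE score > (SELECT AVG(score) FROM students)   -- subquery
      AND dept IN ('CS', 'EE')                        -- AND + IN
    ORDER BY name
""").fetchall()
print(rows)  # only students scoring above the average of 79
```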
🚨 Database Series #25: Query Optimization Techniques

Your query works… but why is it slow only in production? Same query. Same database structure. Different performance. That’s where real SQL engineering begins in Microsoft SQL Server.

🔍 Core Concept
Query optimization isn’t just rewriting SQL. It’s understanding how SQL Server executes your query. Three advanced concepts you must know:

⚙️ SARGability (Search ARGument Ability)
✔ SARGable → Uses indexes efficiently
❌ Non-SARGable → Forces full scans
Example:
❌ Bad: WHERE YEAR(OrderDate) = 2025
✔ Good: WHERE OrderDate >= '2025-01-01' AND OrderDate < '2026-01-01'
👉 Same result, completely different performance.

⚙️ Parameter Sniffing
SQL Server caches execution plans.
First execution = plan is generated
Next executions = same plan reused
✔ Good when data is consistent
⚠️ Bad when data distribution varies
Result:
❌ Fast sometimes
❌ Slow other times

⚙️ Temp Tables vs Table Variables
✔ Temp Tables (#Temp)
→ Better for large datasets
→ Statistics available
→ Optimizer makes better decisions
✔ Table Variables (@Table)
→ Lightweight
→ Good for small data
→ Limited statistics

🧩 Visual Diagram
Query Optimization
│
├── SARGable → Index Seek ⚡
│
├── Parameter Sniffing → Plan reuse ⚠️
│
└── Temp vs Table Variable
    ├── Temp → Large data ✔
    └── Variable → Small data ✔

⚠️ Common Mistake
Focusing only on query syntax
Ignoring:
❌ Execution plans
❌ Data distribution
❌ Index usage
🚨 SQL performance is not about “what you write”
It’s about how it executes

🧠 Practical Takeaway
✔ Write SARGable queries
✔ Be aware of plan caching behavior
✔ Choose the right temporary structure
Golden rule: ➡️ Think like the optimizer, not just a developer.

💬 Let’s Discuss
What caused you the most trouble in production?
A) Bad indexing
B) Parameter sniffing
C) Non-SARGable queries

🔜 Next in the series: SQL Server Security — Logins, Users, Roles & Permissions
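The SARGability point is easy to observe in an execution plan. A sketch using Python's sqlite3 rather than SQL Server (the mechanism is the same; the Orders table and dates are hypothetical, strftime stands in for YEAR(), and EXPLAIN QUERY PLAN stands in for the graphical plan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Orders (OrderDate TEXT, Amount REAL)")
cur.execute("CREATE INDEX idx_date ON Orders(OrderDate)")
cur.executemany("INSERT INTO Orders VALUES (?, ?)",
                [("2024-11-03", 10.0), ("2025-02-14", 20.0), ("2025-07-01", 30.0)])

def plan(sql):
    return " ".join(r[3] for r in cur.execute("EXPLAIN QUERY PLAN " + sql))

# Non-SARGable: wrapping the column in a function hides it from the index.
bad = plan("SELECT * FROM Orders WHERE strftime('%Y', OrderDate) = '2025'")

# SARGable: a bare range on the column lets the optimizer seek the index.
good = plan("SELECT * FROM Orders WHERE OrderDate >= '2025-01-01'"
            " AND OrderDate < '2026-01-01'")

print(bad)   # full scan despite the index
print(good)  # index search on idx_date
```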
Day 4 — Going Beyond Basic SQL 🚀

Most beginners stop at "SELECT *". Today, I pushed past that and explored how databases actually work behind the scenes in real-world systems. Here’s what I learned:

1️⃣ Handling Files in Databases
Discovered how SQL can store binary data using "VARBINARY(MAX)". But in production, storing file paths is often preferred for better performance and scalability.

2️⃣ Boolean Logic in Databases
Learned that systems like SQLite use integers (1/0) for TRUE/FALSE. This powers real-world features like:
• user activation
• verification status
• feature toggles

3️⃣ Industry-Level Data Types
Explored specialized types like:
• XML → for structured configuration data
• GEOMETRY / Spatial → used in maps, logistics, and GIS systems
These are widely used in enterprise applications.

4️⃣ Creating Tables from Existing Data
CREATE TABLE SubTable AS
SELECT CustomerID, CustomerName
FROM Customer;
Simple, but powerful for:
• backups
• reporting
• migrations
• testing

5️⃣ Schema Evolution with ALTER TABLE
Practiced modifying table structures:
• adding/resizing columns
• dropping columns
This is a key part of database maintenance.

6️⃣ Understanding Data Deletion
Knowing the difference matters:
• "DELETE" → removes selected rows
• "TRUNCATE" → clears all rows, keeps structure
• "DROP" → removes the table entirely

💡 SQL isn’t just about querying data — it’s about designing, maintaining, and scaling systems. Every day, I’m moving from writing queries → understanding how data powers real applications.

On to Day 5.

#SQL #Databases #BackendDevelopment #DataEngineering #SoftwareEngineering #LearningInPublic
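Points 4 and 5 can be exercised together in a few lines. A sketch via Python's sqlite3 (the Customer rows and the added Status column are invented for illustration; note SQLite supports ADD COLUMN but not TRUNCATE, which is why point 6 isn't shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Customer (CustomerID INTEGER, CustomerName TEXT, Email TEXT)")
cur.executemany("INSERT INTO Customer VALUES (?, ?, ?)",
                [(1, "Asha", "a@example.com"), (2, "Ravi", "r@example.com")])

# Point 4: spin off a table from existing data (handy for backups/testing).
cur.execute("CREATE TABLE SubTable AS SELECT CustomerID, CustomerName FROM Customer")

# Point 5: evolve the copy's schema without touching the original table.
cur.execute("ALTER TABLE SubTable ADD COLUMN Status TEXT DEFAULT 'active'")

rows = cur.execute(
    "SELECT CustomerID, CustomerName, Status FROM SubTable ORDER BY CustomerID"
).fetchall()
print(rows)  # existing rows pick up the column's default value
```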
𝗪𝗿𝗶𝘁𝗶𝗻𝗴 𝗮𝗻 𝗦𝗤𝗟 𝗾𝘂𝗲𝗿𝘆 𝗶𝘀 𝗲𝗮𝘀𝘆. 𝗪𝗿𝗶𝘁𝗶𝗻𝗴 𝗮 𝗳𝗮𝘀𝘁 𝗦𝗤𝗟 𝗾𝘂𝗲𝗿𝘆 𝗶𝘀 𝗮 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝘀𝗸𝗶𝗹𝗹.

When working with small datasets, almost any query works. But in real-world databases with millions of rows, poorly written queries can become slow and expensive. Here are 5 practical tips to optimize SQL queries 👇

1️⃣ Use indexes on frequently filtered columns
Indexes help databases find data faster.
Example:
CREATE INDEX idx_customer_id ON orders(customer_id);
Columns used in WHERE, JOIN, or ORDER BY are great candidates for indexing.

2️⃣ Avoid SELECT *
Fetching all columns may seem convenient, but it increases memory usage and query time.
Better approach:
SELECT id, name, amount FROM orders;
Only select the columns you actually need.

3️⃣ Prefer JOINs over nested subqueries
In many cases, JOINs are more efficient and easier to optimize.
Example:
SELECT customers.name, SUM(orders.amount) AS total_spent
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name;

4️⃣ Filter data as early as possible
Applying filters early reduces the number of rows processed.
Example:
SELECT product, SUM(amount) AS total_sales
FROM sales
WHERE region = 'East'
GROUP BY product;
The WHERE clause discards other regions before the GROUP BY runs, so only relevant rows are processed.

5️⃣ Avoid leading wildcards in LIKE
This query is slow:
WHERE name LIKE '%John%'
Better:
WHERE name LIKE 'John%'
This allows indexes to work efficiently.

💡 Key takeaway
Small improvements in your SQL queries can lead to huge performance gains, especially when working with large datasets.

Curious to know 👇 What’s one SQL optimization trick you’ve learned recently?

#SQL #DataAnalytics #SQLTips #LearningInPublic #DataAnalyticsJourney
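Tip 5 can be checked against an actual plan. A sketch using Python's sqlite3 with an invented customers table; one SQLite-specific wrinkle is that the engine only rewrites LIKE 'prefix%' into an index range when LIKE is case-sensitive, hence the pragma:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# SQLite's LIKE is case-insensitive by default, which disables the
# prefix-to-index-range optimization; turn case sensitivity on for the demo.
cur.execute("PRAGMA case_sensitive_like = ON")
cur.execute("CREATE TABLE customers (name TEXT)")
cur.execute("CREATE INDEX idx_name ON customers(name)")
cur.executemany("INSERT INTO customers VALUES (?)",
                [("John Doe",), ("Ajohn",), ("Mary",)])

def plan(sql):
    return " ".join(r[3] for r in cur.execute("EXPLAIN QUERY PLAN " + sql))

# Leading wildcard: the index cannot narrow the search, every entry is visited.
slow = plan("SELECT name FROM customers WHERE name LIKE '%John%'")

# Prefix match: the optimizer turns it into a range seek on the index.
fast = plan("SELECT name FROM customers WHERE name LIKE 'John%'")
```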
𝗗𝗮𝘆 1/30: 𝗦𝗤𝗟 𝗙𝘂𝗻𝗱𝗮𝗺𝗲𝗻𝘁𝗮𝗹𝘀🔥: 𝗦𝘁𝗮𝗿𝘁𝗶𝗻𝗴 𝗪𝗶𝘁𝗵 𝗕𝗮𝘀𝗶𝗰𝘀

Before writing complex queries, you need to understand how SQL is structured. These 5 command types are the base of everything.

1️⃣ 𝗗𝗗𝗟 (𝗗𝗮𝘁𝗮 𝗗𝗲𝗳𝗶𝗻𝗶𝘁𝗶𝗼𝗻 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to define and manage database structure.
• CREATE – Create database objects (DATABASE, TABLE, INDEX, VIEW)
• ALTER – Modify structure (ADD, MODIFY, DROP COLUMN)
• DROP – Delete database objects (TABLE, DATABASE)
• TRUNCATE – Remove all records from a table (no condition)
• RENAME – Rename database objects
• Constraints – Rules on data (PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK, DEFAULT)

2️⃣ 𝗗𝗠𝗟 (𝗗𝗮𝘁𝗮 𝗠𝗮𝗻𝗶𝗽𝘂𝗹𝗮𝘁𝗶𝗼𝗻 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to insert, update, and delete data.
• INSERT – Add new records (single/bulk)
• UPDATE – Modify existing records (with conditions)
• DELETE – Remove records (specific or all)

3️⃣ 𝗗𝗤𝗟 (𝗗𝗮𝘁𝗮 𝗤𝘂𝗲𝗿𝘆 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to retrieve data from the database.
• SELECT – Fetch data (WHERE, DISTINCT, ORDER BY, GROUP BY, HAVING, LIMIT/TOP)

4️⃣ 𝗗𝗖𝗟 (𝗗𝗮𝘁𝗮 𝗖𝗼𝗻𝘁𝗿𝗼𝗹 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to control access and permissions.
• GRANT – Provide access to users
• REVOKE – Remove access from users

5️⃣ 𝗧𝗖𝗟 (𝗧𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻 𝗖𝗼𝗻𝘁𝗿𝗼𝗹 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to manage transactions in the database.
• COMMIT – Save changes permanently
• ROLLBACK – Undo changes
• SAVEPOINT – Set a point for partial rollback

💡 𝗦𝗤𝗟 𝗶𝘀 𝗻𝗼𝘁 𝗷𝘂𝘀𝘁 𝗾𝘂𝗲𝗿𝘆𝗶𝗻𝗴. 𝗜𝘁’𝘀 𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲 + 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 + 𝗿𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆.

Follow for Day 2 🚀

#SQL #DataEngineering #LearnSQL #Database #Analytics #Tech #DataAnalytics
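The TCL commands are the least-practiced of the five, so here is a runnable sketch of COMMIT, ROLLBACK, and SAVEPOINT working together, using Python's sqlite3 with an invented accounts table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # let us manage transactions with raw SQL
cur = conn.cursor()
cur.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")

cur.execute("BEGIN")                                  # start a transaction
cur.execute("INSERT INTO accounts VALUES ('A', 100)")
cur.execute("SAVEPOINT before_b")                     # marker for partial rollback
cur.execute("INSERT INTO accounts VALUES ('B', 200)")
cur.execute("ROLLBACK TO before_b")                   # undo only B's insert
cur.execute("COMMIT")                                 # persist A's insert permanently

rows = cur.execute("SELECT name, balance FROM accounts").fetchall()
print(rows)  # only account A survives
```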
In today's SQL lesson, we answer: "Which customers spent above average this month?"

Simple question on the surface. But answering it requires three separate things:
- a monthly total per customer
- the average across all customers
- and a comparison between the two.

Usually this becomes a subquery inside a subquery inside a subquery: one that works, but that nobody, including the author, can read three days later. CTEs are the fix.

Breaking it down:
• WITH … AS gives a subquery a name. That name becomes a temporary table you can reference anywhere below in the same query. Nothing is stored permanently. It's computed on the fly, exists only for the duration of the query and disappears when it's done.
• Chaining CTEs is what makes this powerful. You can chain as many as you need, each one adding a layer of logic on top of the last.
• WHERE total > avg_total is the final filter. By the time SQL reaches this line, the heavy lifting is already done in the CTEs. The comparison is clean and readable.

Where CTEs earn their place:
• Multi-step logic → name each stage, build on it sequentially
• Reuse within a query → reference the same CTE multiple times, compute once
• Debugging → run each block in isolation to inspect results
• Replacing views → avoid a permanent view for one-time complex logic
• Code reviews → colleagues can read, extend and maintain it

When should you still use a subquery? CTEs aren't always the answer. For a simple, single-use inline lookup, filtering by a subquery result is shorter and perfectly readable 👇

| WHERE amount > (SELECT AVG(amount) FROM sales) |

Why does this matter beyond SQL? CTEs are a design decision. A query written in CTEs can be handed to another analyst and understood in five minutes. A deeply nested query is a puzzle that needs the original author in the room. As data teams grow, readability becomes a requirement. A CTE is how SQL that you’ve written survives beyond you.

🔖 Save this if you work with data.
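The three stages above can be written as chained CTEs. A minimal sketch via Python's sqlite3 (the sales table and amounts are invented, and the month filter is omitted for brevity; the shape of the query is what matters):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?)", [
    ("Asha", 120.0), ("Asha", 80.0),   # 200 total
    ("Ravi", 40.0),                    # 40 total
    ("Meena", 90.0),                   # 90 total
])

rows = cur.execute("""
    WITH totals AS (                   -- stage 1: total per customer
        SELECT customer, SUM(amount) AS total
        FROM sales
        GROUP BY customer
    ),
    benchmark AS (                     -- stage 2: average of those totals
        SELECT AVG(total) AS avg_total FROM totals
    )
    SELECT customer, total             -- stage 3: the clean final filter
    FROM totals, benchmark
    WHERE total > avg_total
""").fetchall()
print(rows)  # customers above the 110.0 average of totals
```

Each CTE can also be run on its own while debugging, which is exactly the "run each block in isolation" point above.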
✅ Follow me for more practical SQL, data engineering tips and automation breakdowns for teams that run on data.
💡 What *really* happens when you run an SQL query?

Let’s break it down with a simple example:
`SELECT name, age FROM users WHERE city = 'New York';`

Most developers stop at writing queries. But the real growth starts when you understand what happens *under the hood* 👇

---

⚙️ **𝗦𝘁𝗲𝗽 𝟭: 𝗧𝗿𝗮𝗻𝘀𝗽𝗼𝗿𝘁 𝗦𝘂𝗯𝘀𝘆𝘀𝘁𝗲𝗺**
The moment you hit “Run”, your query doesn’t jump straight into the database. It first lands in the Transport Subsystem — the gatekeeper.
✅ Manages client connections
✅ Authenticates & authorizes requests
✅ Decides whether your query is allowed to proceed

---

🧠 **𝗦𝘁𝗲𝗽 𝟮: 𝗤𝘂𝗲𝗿𝘆 𝗣𝗿𝗼𝗰𝗲𝘀𝘀𝗼𝗿**
This is where your SQL gets *understood*. It has two key components:
🔹 **𝗤𝘂𝗲𝗿𝘆 𝗣𝗮𝗿𝘀𝗲𝗿**
Breaks your query into parts (SELECT, FROM, WHERE)
Checks syntax and builds a parse tree
🔹 **𝗤𝘂𝗲𝗿𝘆 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗲𝗿**
Validates tables/columns (semantic checks)
Figures out the *most efficient way* to run your query
🎯 Output: An optimized execution plan

---

🚀 **𝗦𝘁𝗲𝗽 𝟯: 𝗘𝘅𝗲𝗰𝘂𝘁𝗶𝗼𝗻 𝗘𝗻𝗴𝗶𝗻𝗲**
Now the plan turns into action. The Execution Engine:
✅ Follows the execution plan step-by-step
✅ Coordinates with lower layers
✅ Collects and merges results

---

💾 **𝗦𝘁𝗲𝗽 𝟰: 𝗦𝘁𝗼𝗿𝗮𝗴𝗲 𝗘𝗻𝗴𝗶𝗻𝗲**
This is where the actual data work happens. Think of it as a team working behind the scenes:
👨💼 Transaction Manager → ensures consistency
🔒 Lock Manager → prevents conflicts
⚡ Buffer Manager → fetches data from memory/disk
🧾 Recovery Manager → logs for rollback & recovery

---

🔍 The key insight? Your SQL query is not just a command. It’s a *journey through multiple layers of abstraction, optimization, and coordination.* And understanding this is what separates:
👉 Query writers from system thinkers

---

💬 Curious — what else would you add to this journey?

#SQL #Databases #BackendEngineering #SystemDesign #SoftwareEngineering
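Two of these layers can be peeked at directly. A sketch with Python's sqlite3 (an embedded engine, so there is no transport subsystem, but the optimizer's plan and the execution engine's program are both inspectable; the users table is invented and mirrors the post's example query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, age INTEGER, city TEXT)")
cur.execute("INSERT INTO users VALUES ('Ana', 30, 'New York')")

query = "SELECT name, age FROM users WHERE city = 'New York'"

# Step 2's output: the optimizer's high-level access plan.
plan = cur.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Step 3's input: the bytecode program SQLite's virtual machine executes.
program = cur.execute("EXPLAIN " + query).fetchall()
opcodes = [row[1] for row in program]  # e.g. OpenRead, Column, ResultRow, Halt
```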
⏱️ I once wrote a SQL query that took around 12 minutes to run. The output was correct. But the execution was flawed.

I remember staring at the loading screen thinking: is this normal? I discussed it with teammates, searched online for better approaches, and rewrote the same query. The result came back in under 5 minutes. That moment taught me something I had been ignoring: getting the right answer is only half the job. Getting it efficiently is the other half.

Here are a few small things that helped me write faster queries:

● Filter early, not late: Apply WHERE conditions before joins whenever possible. Less data entering a join means less work for the database. For instance, think of joining an orders table with a customers table and then filtering for orders placed in 2024. Moving that date filter before the join reduces the data going into the join significantly.

● Avoid SELECT *: Only retrieve the columns you actually need. Pulling unnecessary columns wastes memory and processing time, especially on wide tables. For instance, a product table with 60+ columns queried with SELECT * just to display a name and price is doing a lot of unnecessary work. Selecting product_name, price is all that is needed.

● Joins become expensive on large data: Joins are powerful, but they are not free. The larger the tables, the more work the compute unit has to do. Reducing rows before joining can improve performance a lot. For instance, joining a raw transactions table with 8 million rows to a users table before filtering is expensive. Filter down to the relevant users first, then join the smaller result.

● Check indexes before rewriting everything: If the column used in your filter or join does not have an index, the database may need to scan the entire table row by row. In many cases, adding the right index can improve speed significantly. Example scenario: imagine a library with no catalog system. To find one book, you would have to walk through every single shelf. A catalog fixes that, and an index works the same way: it gives the database a shortcut to find rows without scanning the entire table.

Query optimization is one of those topics where the more you learn, the more you realize there is still left to understand.

What is a SQL lesson that took you longer than it should have to learn? Curious to hear 👇

#SQL #QueryOptimization #Analytics #DataScience #Programming
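The "filter early" point from the list above, written both ways on an invented orders/customers pair via Python's sqlite3. Both forms return identical rows (and modern optimizers often push the predicate down themselves); the early-filter version matters on engines and data sizes where that pushdown doesn't happen:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
cur.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)")
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "2023-05-01", 50.0), (1, "2024-03-10", 75.0), (2, "2024-08-02", 20.0),
])

# Filter inside a derived table, so only 2024 rows ever enter the join.
early = cur.execute("""
    SELECT c.name, o.amount
    FROM (SELECT * FROM orders WHERE order_date >= '2024-01-01') AS o
    JOIN customers c ON c.id = o.customer_id
    ORDER BY c.name
""").fetchall()

# Same result, but written with the filter after the join.
late = cur.execute("""
    SELECT c.name, o.amount
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.order_date >= '2024-01-01'
    ORDER BY c.name
""").fetchall()
```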
Day 10 of my SQL Journey 🚀

Today’s challenge: the classic "Article Views I" problem. For today's solution, I focused on straightforward data filtering and deduplication in SQL. Sometimes the most effective queries are the ones that leverage core relational logic to compare columns within the exact same row!

🧠 My Approach:
• Select the author_id column from the Views table and immediately alias it as id to match the required output format.
• Use a WHERE clause to filter the dataset, keeping only the rows where the author_id is strictly equal to the viewer_id (meaning the author viewed their own article).
• Apply the DISTINCT keyword to the SELECT statement. Because an author might view their own article multiple times (creating multiple identical rows), this ensures their ID only appears once in the final result set.
• Finally, use the ORDER BY id ASC clause to guarantee the results are sorted in ascending order.

⚡ Key Learnings & SQL Gotchas:
• Row-Level Comparisons: We often use WHERE clauses to compare a column against a static value (like age > 18), but this problem is a great reminder that you can compare two different columns against each other within the exact same row.
• The Necessity of DISTINCT: It is incredibly easy to overlook duplicate data when you are focused on the filtering logic. Always ask yourself, "Can this event happen more than once in the dataset?" If yes, DISTINCT or GROUP BY is your best friend for cleaning up the final output.

📌 Expected Complexity:
• Time: O(N log N) — where N is the number of rows. While the row-by-row filtering is O(N), applying DISTINCT and the final ORDER BY clause requires the database engine to sort or hash the results, which dictates the overall time complexity.
• Space: O(U) — where U is the number of unique authors who viewed their own work. The database must allocate temporary memory to process the deduplication and store the final sorted result set.
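The approach described above maps to a four-line query. A runnable sketch via Python's sqlite3, with invented sample rows including a duplicate self-view to show DISTINCT earning its keep:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Views (article_id INT, author_id INT, viewer_id INT, view_date TEXT)")
cur.executemany("INSERT INTO Views VALUES (?, ?, ?, ?)", [
    (1, 3, 5, "2019-08-01"),
    (2, 7, 7, "2019-08-01"),  # author 7 viewed their own article
    (2, 7, 6, "2019-08-02"),
    (3, 4, 4, "2019-07-21"),
    (3, 4, 4, "2019-07-21"),  # duplicate self-view: DISTINCT collapses it
])

rows = cur.execute("""
    SELECT DISTINCT author_id AS id
    FROM Views
    WHERE author_id = viewer_id     -- column-to-column, row-level comparison
    ORDER BY id ASC
""").fetchall()
print(rows)  # each self-viewing author appears exactly once, ascending
```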