💣 𝗗𝗲𝗹𝗲𝘁𝗶𝗻𝗴 𝗠𝗶𝗹𝗹𝗶𝗼𝗻𝘀 𝗼𝗳 𝗥𝗼𝘄𝘀 𝗶𝗻 𝗦𝗤𝗟? 𝗧𝗵𝗶𝘀 𝗖𝗮𝗻 𝗖𝗿𝗮𝘀𝗵 𝗬𝗼𝘂𝗿 𝗔𝗽𝗽 😬

A very common mistake I see beginners make 👇

```sql
DELETE FROM big_table;
```

Or even this, which looks safer but can be just as dangerous…

```sql
DELETE FROM big_table WHERE condition;
```

👉 On tables with millions of rows 💥

🔴 𝗪𝗵𝗮𝘁 𝗚𝗼𝗲𝘀 𝗪𝗿𝗼𝗻𝗴?
• 🚫 Table locks → the entire app gets blocked
• 🧱 Transaction log explosion
• 🐢 Long-running queries
• 💥 Possible downtime

👉 Your application starts timing out… users panic… 😅

🧠 𝗪𝗵𝘆 𝗧𝗵𝗶𝘀 𝗛𝗮𝗽𝗽𝗲𝗻𝘀
SQL Server doesn't delete instantly:
👉 It logs every row deletion
👉 It holds locks until the transaction completes

So deleting 10M rows = 10M log entries + long-held locks.

👶 𝗕𝗲𝗴𝗶𝗻𝗻𝗲𝗿 𝗠𝗶𝘀𝘁𝗮𝗸𝗲
"Just run one DELETE… it will be faster" ❌
👉 Actually it's the slowest + riskiest way.

⚙️ 𝗖𝗼𝗿𝗿𝗲𝗰𝘁 𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵𝗲𝘀

✅ 𝗕𝗮𝘁𝗰𝗵 𝗗𝗲𝗹𝗲𝘁𝗶𝗼𝗻

```sql
WHILE 1 = 1
BEGIN
    DELETE TOP (10000) FROM big_table WHERE condition;
    IF @@ROWCOUNT = 0 BREAK;
END
```

👉 Deletes in chunks → shorter locks, smaller transactions → safer.

✅ 𝗨𝘀𝗲 𝗧𝗥𝗨𝗡𝗖𝗔𝗧𝗘 (𝗶𝗳 𝗮𝗹𝗹 𝗱𝗮𝘁𝗮 𝗿𝗲𝗺𝗼𝘃𝗲𝗱)

```sql
TRUNCATE TABLE big_table;
```

👉 Minimal logging 👉 Super fast 🚀
⚠️ But no WHERE clause: it removes everything.

✅ 𝗣𝗮𝗿𝘁𝗶𝘁𝗶𝗼𝗻 𝗗𝗿𝗼𝗽 (𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱)
👉 Drop (or truncate) old partitions instead of deleting rows
• Near instant
• Best for very large, time-partitioned datasets

✅ 𝗢𝗳𝗳-𝗣𝗲𝗮𝗸 𝗘𝘅𝗲𝗰𝘂𝘁𝗶𝗼𝗻
👉 Run during low traffic
👉 Avoid impacting users

💡 𝗣𝗿𝗼 𝗧𝗶𝗽
👉 Monitor:
• Transaction log size
• Locks (sp_who2 / DMVs)
• Execution time

🎯 𝗙𝗶𝗻𝗮𝗹 𝗧𝗵𝗼𝘂𝗴𝗵𝘁
Deleting data is easy…
👉 Deleting it safely at scale = real skill 💡

Junior → Runs DELETE
Senior → Designs deletion strategy

I'm on a mission to make people stronger in tech and data 💪📊
👉 Explore more at https://lnkd.in/gqVua8Tu

#SQLServer #DataEngineering #Database #PerformanceTuning #BigData #ETL #SQL #DataEngineer #TechLearning #BigDataYatra
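The monitoring tip above can be sketched with two standard SQL Server tools — `DBCC SQLPERF(LOGSPACE)` for log growth and the `sys.dm_exec_requests` DMV for blocking. This is a minimal sketch to run in a separate session while a batch delete is in flight:

```sql
-- Transaction log usage per database (watch this grow during big deletes)
DBCC SQLPERF(LOGSPACE);

-- Sessions that are currently blocked, and who is blocking them
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```

If the second query starts returning rows pointing at your delete session, that's the signal to shrink the batch size or move the job off-peak.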
SQL Server Deletion Strategies for Large Tables
More Relevant Posts
🚀 The SQL Roadmap: From Zero to Expert

To truly master SQL, you must progress through these core layers:

• The Foundation: Understand DDL (Data Definition) for managing structures like tables, and DML (Data Manipulation) for handling the data itself.
• Querying & Filtering: Master SELECT, WHERE, and logical operators like AND/OR to extract exactly what you need.
• Aggregations & Grouping: Use functions like SUM(), AVG(), and COUNT() with GROUP BY to generate summary statistics.
• Advanced Joins: Move beyond INNER JOIN to master LEFT, RIGHT, and FULL OUTER joins for complex data relationships.

💡 Pro-Level Concepts to Ace Your Interview

If you want to stand out, focus on these advanced topics often asked by top tech companies:

• Window Functions: RANK(), DENSE_RANK(), and LEAD/LAG allow powerful calculations across rows without collapsing your data.
• CTEs vs. Subqueries: Common Table Expressions (CTEs) are often more readable and easier to maintain for complex, multi-step queries.
• Performance Optimization: Understand indexes (clustered vs. non-clustered) to speed up data retrieval.

🧠 Can You Answer These?

Interviewers love conceptual questions to test your depth. Do you know the difference between:

• WHERE vs. HAVING? (Row-level vs. aggregate filtering.)
• DELETE vs. TRUNCATE? (Logged row removal vs. fast table clearing.)
• UNION vs. UNION ALL? (Removing duplicates vs. keeping them for speed.)

🛠️ Practice Resources

Knowledge is nothing without practice. Check out these platforms:

• Beginner: W3Schools, SQLBolt, SQLZoo.
• Intermediate/Expert: LeetCode (Top 50 SQL Plan), DataLemur, and HackerRank.

SQL isn't just about writing code; it's about solving problems and uncovering insights.

What SQL concept took you the longest to "click"? Let's discuss in the comments! 👇

👉 Follow: Dinesh Sahu

#SQL #DataScience #DataEngineering #InterviewPrep #TechCareers #DatabaseManagement #CareerGrowth
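The WHERE vs. HAVING distinction in the list above fits in one query — a sketch against a hypothetical `sales` table (table and column names are made up for illustration):

```sql
-- WHERE filters individual rows BEFORE aggregation;
-- HAVING filters whole groups AFTER aggregation.
SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE amount > 0             -- row-level: drop refunds before grouping
GROUP BY region
HAVING SUM(amount) > 10000;  -- group-level: keep only high-volume regions
```

Swapping the two is the classic interview trap: a row-level condition in HAVING still works but forces the database to aggregate rows it could have discarded earlier.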
A lot of you have been reaching out asking where to start with data and SQL. So I wrote this down.

Most people learn INNER JOIN, LEFT JOIN, maybe a SELF JOIN - and stop there. That's enough to pass an interview. It's not enough to build real pipelines.

The joins that actually matter in production? The ones that break your data when you get them wrong? Those deserve a deeper look.

I covered all of it in my latest blog: the full spectrum of SQL joins, when to use each one, and the edge cases nobody warns you about. Check out the blog; link in the comment section.

If you're starting out or levelling up - this one's for you.

Comment "JOINS" below and I'll drop practice questions your way.

And tell me: what do you want me to cover next? Window functions? Indexing? Query optimization? CTEs? Or something else? I'm writing for you. ↓ Drop it in the comments.

#SQL #DataEngineering #DataCareer #Analytics
Do you use GROUP BY to find duplicates in SQL?

That's usually the first thing most of us learn, and it works well to detect duplicates. But here's where many of us get stuck:

👉 What if you actually need to remove duplicates?
👉 How do you identify which rows are the exact duplicates?

GROUP BY won't help much there, as it only gives counts, not row-level detail. To handle this properly, you need a way to work at the row level, and that's where ROW_NUMBER() with PARTITION BY becomes useful.

I've written a short 2-minute tech blog explaining this with a simple example. If you're learning SQL or working with real datasets, this might be useful 👇

#SQL #ROW_NUMBER #TechBlog
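The ROW_NUMBER() + PARTITION BY pattern described above looks roughly like this — a sketch where the `users` table and its columns are hypothetical:

```sql
-- Number each copy within a duplicate group, keep #1, delete the rest.
WITH ranked AS (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY email   -- column(s) that define a "duplicate"
               ORDER BY id          -- which copy survives (lowest id here)
           ) AS rn
    FROM users
)
DELETE FROM users
WHERE id IN (SELECT id FROM ranked WHERE rn > 1);
```

Unlike GROUP BY ... HAVING COUNT(*) > 1, this gives you the exact row-level detail needed to delete only the extra copies.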
Stop overcomplicating SQL. It all boils down to these 4 pillars. ⬇️

Most people think SQL is just about "SELECT *". But if you want to master data, you need to understand the whole ecosystem:

🔹 DQL (Querying): How you ask the database for answers.
🔹 DML (Manipulation): How you add, change, or delete the actual data.
🔹 DDL (Structure): How you build the "skeleton" or blueprint of the database.
🔹 Relationships: How different tables "talk" to each other using keys.

Whether you're a Data Analyst, Dev, or PM, these fundamentals never change.

Which of these was the hardest for you to wrap your head around when you started?

#SQL #DataAnalytics #DataEngineering #CodingTips #TechCommunity
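The "Relationships" pillar above is easiest to see in DDL — a minimal sketch with made-up table names:

```sql
-- DDL builds the skeleton; a foreign key is the "relationship"
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id)  -- tables "talk" via keys
);
```

Every JOIN you will ever write is just following a key relationship like this one.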
Hello everyone! 👋 Welcome to Day 4 of #100DaysOfSQL 🚀

👉 Topic: SQL JOINS

SQL JOINS combine data from multiple tables based on a related column. They are essential for working with relational databases.

🔹 Types of JOINS:

1. INNER JOIN
Returns only matching records from both tables.

2. LEFT JOIN (LEFT OUTER JOIN)
Returns all records from the left table and matching records from the right table.

3. RIGHT JOIN (RIGHT OUTER JOIN)
Returns all records from the right table and matching records from the left table.

4. FULL JOIN (FULL OUTER JOIN)
Returns all records from both tables, matched where possible.

🔹 Example Tables:

Employees: EmpID, Name, DeptID
Departments: DeptID, DeptName

🔹 Example Queries:

✔️ INNER JOIN
SELECT e.Name, d.DeptName
FROM Employees e
INNER JOIN Departments d ON e.DeptID = d.DeptID;

✔️ LEFT JOIN
SELECT e.Name, d.DeptName
FROM Employees e
LEFT JOIN Departments d ON e.DeptID = d.DeptID;

✔️ RIGHT JOIN
SELECT e.Name, d.DeptName
FROM Employees e
RIGHT JOIN Departments d ON e.DeptID = d.DeptID;

✔️ FULL OUTER JOIN
SELECT e.Name, d.DeptName
FROM Employees e
FULL OUTER JOIN Departments d ON e.DeptID = d.DeptID;

💡 Real-Life Example:
- Employees table = employees list
- Departments table = department details
- JOIN = connecting employees to their departments

🚀 Mastering JOINS helps you:
✔️ Combine data efficiently
✔️ Write powerful queries
✔️ Solve real-world business problems

#SQL #DataAnalytics #Database #Learning #Tech #100DaysOfCode
Day 122

Today I was revisiting something that once gave me sleepless nights (and very slow dashboards): SQL optimization techniques that actually made my queries faster.

Back in my earlier experience, I remember dealing with a classic support issue: "Why is this query taking forever?" And of course… it was running fine yesterday. After digging through logs, execution plans, and a lot of trial and error, here are a few techniques that genuinely made a difference:

1. Indexing (the real hero)
At one point, a query scanning millions of rows was taking minutes. Adding the right index turned it into seconds. Lesson learned: not all heroes wear capes; some are just well-placed indexes.

2. Avoid SELECT *
Pulling unnecessary columns was slowing things down more than expected. Fetching only the required columns reduced memory usage and improved speed noticeably.

3. Optimizing JOINs
Improper joins were causing massive row explosions. Switching to the right join type and ensuring proper join conditions helped control the chaos.

4. Filtering Early
Applying WHERE conditions as early as possible reduced the dataset before heavy operations. Less data = less pain.

5. Using EXPLAIN Plans
Understanding how the database executes queries was a game changer. It felt like finally reading the answer sheet instead of guessing.

6. Handling NULLs and Data Types Carefully
Mismatched data types (which can force implicit conversions and prevent index use) and NULL-heavy columns were silently impacting performance.

7. Breaking Down Complex Queries
Sometimes one giant query isn't smart. Breaking it into smaller steps (CTEs or temp tables) improved both readability and performance.

Looking back, most performance issues weren't "database problems"; they were "query writing problems."

Still learning, still optimizing, and still occasionally staring at a query wondering "why are you like this?"

If you've faced similar issues or have optimization tricks, I'd love to hear them.
#SQL #DataEngineering #ETL #DatabaseOptimization #QueryPerformance #TechLearning #Upskilling #DataAnalytics #SQLTips #DataEngineer #LearningInPublic #CareerGrowth #ITSupport #ApplicationSupport #DataPipeline #BigData #TechCommunity
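Points 1 and 5 above fit in a few lines — a sketch with illustrative table and index names. (`EXPLAIN` is the MySQL/PostgreSQL form; SQL Server uses graphical plans or `SET SHOWPLAN_ALL`.)

```sql
-- Before: inspect the plan — likely a full table scan on a big table
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Add an index on the filter column...
CREATE INDEX ix_orders_customer_id ON orders (customer_id);

-- ...then fetch only the columns you need and re-check the plan
EXPLAIN SELECT order_id, order_date
FROM orders
WHERE customer_id = 42;
```

The before/after plan comparison is the "answer sheet" moment: you can see the scan turn into an index seek instead of guessing.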
🚀 SQL Window Functions – Deep Dive (Hands-on Practice)

Taking my SQL skills further by working on window functions using real datasets 👇

🔹 What are Window Functions?
They perform calculations across a set of rows while retaining individual rows (unlike GROUP BY).
• OVER() → defines the window
• PARTITION BY → like GROUP BY, but without collapsing rows
• ORDER BY → defines the order within each partition

🔹 1. ROW_NUMBER() – unique row number per group

SELECT product, unitPrice,
       ROW_NUMBER() OVER (PARTITION BY product ORDER BY unitPrice DESC) AS row_num
FROM samples.bakehouse.sales_transactions;

🔹 2. RANK() – same rank for ties (with gaps)

SELECT product, unitPrice,
       RANK() OVER (PARTITION BY product ORDER BY unitPrice DESC) AS price_rank
FROM samples.bakehouse.sales_transactions;

🔹 3. DENSE_RANK() – same rank for ties (no gaps)

SELECT product, unitPrice,
       DENSE_RANK() OVER (PARTITION BY product ORDER BY unitPrice DESC) AS dense_price_rank
FROM samples.bakehouse.sales_transactions;

🔹 4. SUM() OVER() – total per partition

SELECT product, unitPrice,
       SUM(unitPrice) OVER (PARTITION BY product) AS sum_over
FROM samples.bakehouse.sales_transactions;

🔹 5. AVG() OVER() – average per partition

SELECT product, unitPrice,
       AVG(unitPrice) OVER (PARTITION BY product) AS avg_over
FROM samples.bakehouse.sales_transactions;

🔹 6. LAG() / LEAD() – previous & next values (note: LAG/LEAD need an ORDER BY inside OVER)

SELECT product, unitPrice,
       LAG(unitPrice) OVER (PARTITION BY product ORDER BY unitPrice) AS prev_price
FROM samples.bakehouse.sales_transactions;

SELECT product, unitPrice,
       LEAD(unitPrice) OVER (PARTITION BY product ORDER BY unitPrice) AS next_price
FROM samples.bakehouse.sales_transactions;

💡 Key Learnings:
• Window functions are powerful for analytics & trend analysis
• Unlike GROUP BY, they don't reduce rows
• Widely used in ranking, running totals, comparisons

🔥 Most-asked interview use case:
👉 Top N records per group using ROW_NUMBER()

#SQL #DataAnalytics #DataEngineering #WindowFunctions #PySpark #Databricks #Learning #100DaysOfCode
🚀 Ever wondered how SQL queries actually execute behind the scenes?

Most developers write queries in this order:
👉 SELECT → FROM → WHERE → GROUP BY → HAVING → ORDER BY

But the logical execution order is very different:

1️⃣ FROM (including JOINs)
2️⃣ WHERE (filter rows)
3️⃣ GROUP BY (aggregate data)
4️⃣ HAVING (filter groups)
5️⃣ SELECT (choose columns; window functions like ROW_NUMBER and RANK are evaluated here)
6️⃣ DISTINCT (remove duplicates)
7️⃣ ORDER BY (sort results)
8️⃣ LIMIT / OFFSET (restrict output)

💡 Understanding this flow helps you:
✔️ Write optimized queries
✔️ Debug issues faster
✔️ Avoid common mistakes in aggregations & filters

If you're working with data using SQL, mastering this concept is a game changer 🔥

📌 Save this for future reference & share it with your network!

#SQL #AzureDataEngineering #DataAnalytics #Database #Learning #Azuresql #BigData #Analytics #DataScience
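One practical consequence of this order, sketched against a hypothetical `employees` table: WHERE runs before SELECT, so it cannot see the aliases SELECT defines.

```sql
-- ❌ Fails in most engines: WHERE (step 2) runs before SELECT (step 5),
--    so the alias `yearly_pay` does not exist yet
SELECT name, salary * 12 AS yearly_pay
FROM employees
WHERE yearly_pay > 100000;

-- ✅ Repeat the expression (or wrap the query in a CTE/subquery)
SELECT name, salary * 12 AS yearly_pay
FROM employees
WHERE salary * 12 > 100000;
```

The same reasoning explains why group conditions go in HAVING, not WHERE: grouping hasn't happened yet when WHERE is evaluated.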
𝗗𝗮𝘆 1/30: 𝗦𝗤𝗟 𝗙𝘂𝗻𝗱𝗮𝗺𝗲𝗻𝘁𝗮𝗹𝘀 🔥: 𝗦𝘁𝗮𝗿𝘁𝗶𝗻𝗴 𝗪𝗶𝘁𝗵 𝗕𝗮𝘀𝗶𝗰𝘀

Before writing complex queries, you need to understand how SQL is structured. These 5 command types are the base of everything.

1️⃣ 𝗗𝗗𝗟 (𝗗𝗮𝘁𝗮 𝗗𝗲𝗳𝗶𝗻𝗶𝘁𝗶𝗼𝗻 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to define and manage database structure.
• CREATE – create database objects (DATABASE, TABLE, INDEX, VIEW)
• ALTER – modify structure (ADD, MODIFY, DROP COLUMN)
• DROP – delete database objects (TABLE, DATABASE)
• TRUNCATE – remove all records from a table (no condition)
• RENAME – rename database objects
• Constraints – rules on data (PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK, DEFAULT)

2️⃣ 𝗗𝗠𝗟 (𝗗𝗮𝘁𝗮 𝗠𝗮𝗻𝗶𝗽𝘂𝗹𝗮𝘁𝗶𝗼𝗻 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to insert, update, and delete data.
• INSERT – add new records (single/bulk)
• UPDATE – modify existing records (with conditions)
• DELETE – remove records (specific or all)

3️⃣ 𝗗𝗤𝗟 (𝗗𝗮𝘁𝗮 𝗤𝘂𝗲𝗿𝘆 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to retrieve data from the database.
• SELECT – fetch data (WHERE, DISTINCT, ORDER BY, GROUP BY, HAVING, LIMIT/TOP)

4️⃣ 𝗗𝗖𝗟 (𝗗𝗮𝘁𝗮 𝗖𝗼𝗻𝘁𝗿𝗼𝗹 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to control access and permissions.
• GRANT – provide access to users
• REVOKE – remove access from users

5️⃣ 𝗧𝗖𝗟 (𝗧𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻 𝗖𝗼𝗻𝘁𝗿𝗼𝗹 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲)
👉 Used to manage transactions in the database.
• COMMIT – save changes permanently
• ROLLBACK – undo uncommitted changes
• SAVEPOINT – set a point for partial rollback

💡 𝗦𝗤𝗟 𝗶𝘀 𝗻𝗼𝘁 𝗷𝘂𝘀𝘁 𝗾𝘂𝗲𝗿𝘆𝗶𝗻𝗴. 𝗜𝘁'𝘀 𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲 + 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 + 𝗿𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆.

Follow for Day 2 🚀

#SQL #DataEngineering #LearnSQL #Database #Analytics #Tech #DataAnalytics
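The TCL commands above fit in one hedged sketch (the `accounts` table is made up; this is the PostgreSQL/standard spelling — SQL Server writes savepoints as `SAVE TRANSACTION` and rolls back with `ROLLBACK TRANSACTION after_debit`):

```sql
BEGIN TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
SAVEPOINT after_debit;             -- marker for a partial rollback

UPDATE accounts SET balance = balance + 100 WHERE id = 2;
ROLLBACK TO SAVEPOINT after_debit; -- undo only the second update

COMMIT;                            -- the first update is saved permanently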
NULLs in SQL Joins: The Interview Question That Trips Everyone 😵‍💫

If you've ever written a SQL query that looked perfect… but the output still felt off, this might be why.

👉 The culprit: NULL

🔍 The Problem

You write a clean join:

SELECT *
FROM orders o
LEFT JOIN users u ON o.user_id = u.id;

Looks correct, right? But suddenly:
• Some rows don't match
• Data goes missing
• Counts feel off

No errors. No warnings. Just wrong results.

💣 The Silent Killer: NULL

In SQL, NULL doesn't behave like a normal value:
• NULL = NULL → UNKNOWN
• NULL = 5 → UNKNOWN

👉 Not TRUE
👉 Not FALSE
👉 Just… ignored

🔗 Why this breaks joins

If your join key contains NULL:

o.user_id = u.id

And o.user_id is NULL:
👉 The condition evaluates to UNKNOWN
👉 The join fails silently
👉 You get NULLs on the right side

😬 Real Impact
• Missing city names
• Incomplete mappings
• Wrong aggregations
• Misleading dashboards

And the worst part? You don't even realize it's happening.

🧠 Key Learnings
✅ NULL ≠ NULL
✅ Use IS NULL, not = NULL
✅ NULL never matches in joins
✅ LEFT JOIN can hide issues
✅ Always validate join keys

💡 Pro Tip

Before trusting your data:
👉 Check for NULLs in join columns
👉 Validate mapping coverage
👉 Use COALESCE() when needed

🚀 Final Thought

SQL doesn't fail loudly. It fails silently. And NULL is one of the biggest reasons why.

If you're working with data, mastering NULL handling isn't optional; it's essential.

#SQL #DataAnalytics #DataEngineering #Analytics #LearningSQL
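The validation steps in the Pro Tip above can be sketched like this, reusing the post's hypothetical `orders`/`users` tables:

```sql
-- Step 1: count NULLs in the join key before trusting any join
SELECT COUNT(*) AS null_keys
FROM orders
WHERE user_id IS NULL;      -- IS NULL, never "= NULL"

-- Step 2: make unmatched rows visible instead of silently NULL
SELECT o.id, COALESCE(u.name, 'unknown user') AS user_name
FROM orders o
LEFT JOIN users u ON o.user_id = u.id;
```

If step 1 returns anything above zero, every downstream count and aggregate that joins on `user_id` is quietly missing those rows.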