I spent two hours optimizing a slow query. Indexes. Rewrites. Stack Overflow rabbit holes.

Then a senior engineer looked over my shoulder and typed two words: EXPLAIN ANALYZE.

And just like that, the database told us exactly what it was doing. Every step. Every scan. Every place it was working too hard. I had been guessing. The answer was sitting in the query planner the whole time.

That moment changed how I debug slow queries forever. I stopped assuming and started asking the database itself. Because here's the thing nobody tells you early on: your database has opinions about your queries. It has a plan. A cost estimate. A reason it's slow. And it will tell you all of it, if you just know to ask.

I don't touch a slow query without it now. I feel like I was handed a cheat code two years too late.

What's the tool or command that made you feel like you'd been doing things the hard way all along?

#SoftwareEngineering #BackendDevelopment #Databases #SQL #EngineeringLife #TechLessons #CodeLife
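For anyone who hasn't tried it, here is a minimal sketch of what that looks like in PostgreSQL, using a hypothetical orders table; the table and column names are made up for illustration.

```sql
-- Hypothetical example: fetch one customer's recent orders.
-- EXPLAIN ANALYZE runs the query and prints the plan the database
-- actually chose, with real row counts and timings for every step.
EXPLAIN ANALYZE
SELECT order_id, total_amount
FROM orders
WHERE customer_id = 42
  AND created_at >= '2024-01-01';

-- What to look for in the output:
--   "Seq Scan on orders"           -> the whole table is being read; an index may help
--   estimated rows vs. actual rows -> a large gap means the planner is guessing badly
--   the node with the largest actual time is where the work is really going
```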
Most teams I've worked with underestimate how much time they lose to a poor database indexing strategy. You'll write query after query, optimize the application logic, add caching layers, all while your table scans are silently murdering performance in production.

The worst part? A junior dev can spot it in 5 minutes with the right tools, but nobody thinks to look.

The real skill isn't knowing every index type. It's building the habit of checking execution plans before deployment and understanding which columns actually get filtered or joined. I've seen 30-second queries drop to 50ms with a single composite index.

Once you internalize that the same patterns repeat across your codebase, you stop writing slow code in the first place. The mistake isn't complexity. It's ignoring the fundamentals because they feel boring.

What's the worst performance issue you've inherited that turned out to be a missing index?

#Database #SQL #Performance #BackendDevelopment #DatabaseOptimization
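As a rough illustration of the kind of fix described above (table, column, and index names are hypothetical):

```sql
-- A query that filters on customer_id and status.
-- With no matching index, every call scans the whole table.
EXPLAIN ANALYZE
SELECT order_id, total_amount
FROM orders
WHERE customer_id = 42
  AND status = 'shipped';

-- A composite index on the filtered columns lets the planner jump
-- straight to the matching rows instead of scanning everything.
CREATE INDEX idx_orders_customer_status
    ON orders (customer_id, status);

-- Run the EXPLAIN ANALYZE again afterwards and compare the plans:
-- the sequential scan should be replaced by an index scan.
```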
CTEs have quietly become my favorite SQL feature. Not because they're fancy, but because I can come back to a query three months later and actually understand what past-me was doing. 🧐

Instead of one monstrous nested subquery, you get named blocks that read top to bottom: customers_last_year AS (...), their_orders AS (...), then the final SELECT. That's it. 👇

Recursive CTEs took me longer to warm up to. I avoided them for months because the syntax looked intimidating. Then I had to flatten an employee-manager hierarchy and spent an afternoon fighting it with self-joins before giving up and trying a recursive CTE. It took about 8 lines. I should have learned it sooner. 🔁🤓

Fair warning: they're not always faster. A temp table or an indexed subquery sometimes wins on performance. But for making queries you won't hate opening later, CTEs are the move. 💡

#SQL #CTEs #DataAnalytics #Programming
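As a sketch of the recursive pattern mentioned here, assuming a hypothetical employees table with id, name, and manager_id columns:

```sql
WITH RECURSIVE org_chart AS (
    -- Anchor: employees with no manager, i.e. the top of the hierarchy.
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive step: attach each employee to the row for their manager.
    SELECT e.id, e.name, e.manager_id, oc.depth + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT id, name, manager_id, depth
FROM org_chart
ORDER BY depth, name;
```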
I spent the last few weeks building something I'm genuinely proud of.

It started with a simple question: what does a production-style data pipeline actually look like when you build it from scratch?

So I built one. 𝐎𝐩𝐬𝐏𝐮𝐥𝐬𝐞-𝐍𝐘𝐂-𝐓𝐚𝐱𝐢-𝐏𝐢𝐩𝐞𝐥𝐢𝐧𝐞 — a modular ETL pipeline that pulls NYC Yellow Taxi trip data, cleans it, transforms it, and loads it into a SQL Server database for analysis.

Here's what I learned along the way:
→ Clean architecture isn't optional. When your pipeline breaks at 2am, you'll thank yourself for writing modular code.
→ The pipeline fails loudly, not silently. HTTP errors, missing values, duplicates — nothing slips through quietly. Because bad data that goes unnoticed is worse than a pipeline that stops.
→ Logging is your best friend. If you can't observe it, you can't debug it.
→ A fail-fast strategy saves hours. If extract fails, nothing else runs. Simple. Brutal. Effective.

Tech I used: Python · Pandas · Parquet · MSSQL Server · Requests · Custom logging

The pipeline has 3 stages:
Extract → you enter a month and year, the pipeline fetches the exact Parquet file for that period — no hardcoding, no manual downloads
Transform → deduplicates, cleans nulls, engineers features, aggregates revenue per day
Load → writes structured, clean data directly into MSSQL Server — query-ready from day one

GitHub link in the comments 👇

#DataEngineering #ETL #Datapipeline #Python #MSSQL #DataWarehouse #LearningInPublic
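The revenue-per-day aggregation mentioned in the Transform stage might look roughly like this once the cleaned trips land in SQL Server; the table name here is a guess for illustration, not the repo's actual schema.

```sql
-- Hypothetical daily revenue rollup over cleaned yellow-taxi trips (T-SQL).
SELECT
    CAST(tpep_pickup_datetime AS DATE) AS trip_date,
    COUNT(*)                           AS trip_count,
    SUM(total_amount)                  AS daily_revenue
FROM dbo.yellow_taxi_trips
GROUP BY CAST(tpep_pickup_datetime AS DATE)
ORDER BY trip_date;
```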
🚨 We had 12 database tables… for the SAME data. And it kept growing every month.

I was debugging a backend system where something felt off. Everything worked… but nothing felt right. Then I saw it.

❌ JanuaryData table
❌ FebruaryData table
❌ MarchData table
… and this pattern continued. Every. Single. Month.

At first glance, it looked organized. In reality? 💥 It was a scaling disaster.

As the system grew:
– Queries became messy and repetitive
– Business logic had to be duplicated
– Adding a feature meant touching multiple tables
– Maintenance became painful

💡 The real issue wasn’t performance. It was bad data modeling.

🛠️ What I changed
Instead of creating a new table every month:
✅ Built a single main model
✅ Created a monthly records model
✅ Connected them using a Foreign Key (One-to-Many relationship)

Now:
👉 One entity → multiple monthly entries
👉 Clean, scalable, maintainable structure

🎯 The impact
✔️ Removed duplication
✔️ Simplified queries
✔️ Faster development
✔️ Future-proof design

💡 Lesson: If your database is growing with more tables for the same logic… you don’t have a scaling system. You have a design problem.

Good backend engineers don’t just fix bugs. They fix foundations.

Have you ever seen something like this in production? 😅 What was the worst DB design you’ve fixed? 👇 Drop it below

#Python #Django #DatabaseDesign #BackendDevelopment #SoftwareEngineering #PostgreSQL #WebDevelopment #FullStackDeveloper #DataModeling #CleanCode #TechTips #Programming #OpenToWork #RemoteWork #AWS
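In SQL terms, the fix described above boils down to something like the sketch below; entity and column names are illustrative, not the actual schema.

```sql
-- One table for the entity itself...
CREATE TABLE accounts (
    id   SERIAL PRIMARY KEY,      -- PostgreSQL; use IDENTITY elsewhere
    name TEXT NOT NULL
);

-- ...and one table for its monthly records, instead of a new table per month.
CREATE TABLE monthly_records (
    id         SERIAL PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts (id),
    period     DATE NOT NULL,                -- e.g. the first day of the month
    amount     NUMERIC(12, 2) NOT NULL,
    UNIQUE (account_id, period)              -- one row per account per month
);

-- "January data" becomes a filter, not a separate table:
SELECT a.name, m.amount
FROM monthly_records m
JOIN accounts a ON a.id = m.account_id
WHERE m.period >= '2024-01-01'
  AND m.period <  '2024-02-01';
```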
A week ago, I posted about a small database I was building to understand how databases actually work. This week… it got real. It now survives crashes mid-write and supports transactions.

What started as “let me try storing data in columns” is now:
• writing through a WAL (write-ahead log)
• replaying state after crashes
• supporting multi-step atomic transactions
• parsing queries → building an AST → planning → executing
• deciding between index vs full scan
• running aggregations directly on columnar data

Somewhere along the way, it stopped feeling like a project and started feeling like a database.

Still a long way to go:
• no server / multi-user support yet
• no joins
• vector search is still brute-force
• lots to improve in performance

But honestly, building this has been one of the best ways to actually understand databases. Not just use them.

If you’ve worked on databases / storage engines / Rust — I’d really value your feedback.

Repo: https://lnkd.in/dEVW4aDB
Working on this with him has been one of the most educational things I’ve done.

Biggest shift for me: a DB is not just “store + query”. It is mostly about guarantees under failure.

Things that changed how I think:
• Schema validation is easy; preserving correctness after crash/replay is hard.
• WAL design affects almost everything: write path, recovery, compaction, tests.
• Query performance is less about syntax, more about planner decisions and data layout.
• “Fast” means nothing without workload-specific benchmarks and repeatability.
• Observability matters early. If you can’t measure replay/plan/scan behavior, you’re guessing.

What I’m excited about next:
• better transaction semantics
• ANN/vector indexes (instead of brute force)
• stronger concurrency model
• production-grade benchmarking and recovery testing

If anyone has experience with storage engines in Rust, especially around WAL recovery edge cases and planner heuristics, I would love to learn from your feedback.
A small SQL issue taught me a big lesson in production ⚠️

Recently, I worked on a case where an application was running slow in production, even though everything had been working fine earlier. After digging deeper, the issue was not in the application logic but in the SQL query.

Problem:
👉 Inefficient query + missing optimization
👉 High data volume causing delays

Solution:
✔ Optimized the query
✔ Improved data retrieval performance
✔ Application response time improved significantly

Lesson: in backend systems, even a small SQL inefficiency can impact the entire application.

Still learning and improving every day as a backend engineer 🚀

Have you faced something similar in production? 🤔

#SQL #DotNet #BackendDevelopment #SoftwareEngineering #ProductionSupport
Earlier this week, I was debugging a legacy SQL script. It was a mess of subqueries wrapped inside subqueries, like trying to unwrap an endless stack of boxes just to find one small item. It was hard to read, impossible to debug, and it slowed the whole team down.

The fix? I refactored the entire thing into clean CTEs (Common Table Expressions).

Here is why I’ve made the switch:
• Readability: CTEs let you name your data blocks. You read the code from top to bottom, like a story, not from the inside out.
• Easy debugging: you can test each "block" individually. No more untangling a web of parentheses.
• Team speed: if a teammate can understand your query in 30 seconds instead of 30 minutes, you’ve just saved the company money.

In data science and engineering, "clean" is often better than "clever."

#SQL #DataEngineering #DataScience #CleanCode #TechTips
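A small before/after sketch of that kind of refactor, on made-up tables:

```sql
-- Before: the logic reads from the inside out.
SELECT c.name, t.order_count
FROM customers c
JOIN (
    SELECT customer_id, COUNT(*) AS order_count
    FROM orders
    WHERE created_at >= '2024-01-01'
    GROUP BY customer_id
) t ON t.customer_id = c.id
WHERE t.order_count > 10;

-- After: each step gets a name and can be run and tested on its own.
WITH recent_orders AS (
    SELECT customer_id, COUNT(*) AS order_count
    FROM orders
    WHERE created_at >= '2024-01-01'
    GROUP BY customer_id
),
frequent_customers AS (
    SELECT customer_id, order_count
    FROM recent_orders
    WHERE order_count > 10
)
SELECT c.name, f.order_count
FROM frequent_customers f
JOIN customers c ON c.id = f.customer_id;
```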
🚨 SELECT * is silently killing your SQL queries.

I've seen a 40ms query turn into a 9-second nightmare - all because of one hidden TEXT column. And the culprit? A lazy SELECT * in production.

Here's what most developers don't realize:
→ Every unused column travels across the network on every query
→ LOB columns (TEXT, BLOB) silently explode your RAM usage
→ Databases can't optimize what they don't know you need
→ Schema changes break your app - often without a single error thrown

In a benchmark of 10,000 rows with 22 columns: SELECT * consumed 6× more memory than explicit column lists.

The fix is simple. The discipline is the hard part. Name your columns. Every. Single. Time.

I put together a 7-slide breakdown covering:
✅ Why SELECT * hurts performance
✅ Real benchmark numbers
✅ The breaking changes it causes
✅ The exact fix with code examples
✅ 3 production SQL rules to live by

Swipe through the doc and save it for your next code review.

What's the worst SELECT * story from your production database? Drop it in the comments - I'd love to hear it. 👇

#SQL #DataAnalysis #QueryOptimization #Backend #DatabasePerformance #Programming #TechTips
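The fix in its simplest form, with a hypothetical table standing in for the real one:

```sql
-- Before: drags every column across the wire, including the wide TEXT/BLOB ones.
SELECT *
FROM support_tickets
WHERE status = 'open';

-- After: only the columns the code actually uses.
-- The large "description" TEXT column stays on the server unless it is needed.
SELECT ticket_id, customer_id, status, created_at
FROM support_tickets
WHERE status = 'open';
```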
SQL is 50 years old and still the most underrated skill in tech. 👀

I've seen people with 5 years of experience who can't write a proper JOIN. And I've seen analysts with zero CS degrees outperform engineers just because they mastered SQL.

Quick cheat sheet that took me years to internalize: 📋
🔹 Use WHERE before GROUP BY to filter rows early
🔹 HAVING is for filtering AFTER aggregation
🔹 Window functions (ROW_NUMBER, RANK, LAG) > subqueries 90% of the time
🔹 EXPLAIN ANALYZE is your best friend before pushing any query to prod
🔹 Index your JOIN columns. Always. 🙏

SQL doesn't care about your framework or your favourite language. Bad queries will tank your app regardless. ⚡

What SQL trick took you way too long to learn? 👇

#SQL #Database #BackendDevelopment #SoftwareDevelopment #DataEngineering
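A compact sketch of a few of these points together, on a hypothetical orders table:

```sql
-- WHERE filters rows before aggregation; HAVING filters groups after it.
SELECT customer_id, SUM(total_amount) AS total_spent
FROM orders
WHERE created_at >= '2024-01-01'        -- filter rows early
GROUP BY customer_id
HAVING SUM(total_amount) > 1000;        -- filter on the aggregate

-- A window function instead of a correlated subquery:
-- the latest order per customer via ROW_NUMBER().
SELECT order_id, customer_id, created_at
FROM (
    SELECT order_id, customer_id, created_at,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY created_at DESC) AS rn
    FROM orders
) ranked
WHERE rn = 1;
```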