SQL Practice: Building a Histogram from Raw Activity Data

📊 SQL practice: building a histogram from raw activity data

Continuing with DataLemur SQL challenges, I worked on a problem that involved analyzing how employees interact with a database by constructing a histogram of query activity.

The goal was to determine how many employees executed N unique queries during a given time window (Q3 2023), including those with zero activity.

I approached this in stages:
• filtering query activity within the time window
• counting distinct queries per employee
• and then re-aggregating those results to build the final distribution

Using a LEFT JOIN ensured that employees with no activity were included, which is critical when working with real-world datasets where absence of data is meaningful.

The solution was accepted ✅, and it reinforced a pattern I’m seeing often: transforming granular event data into higher-level summaries that can support analysis and decision-making.

This type of problem feels very aligned with analytics and data engineering workflows, where building reliable intermediate datasets is just as important as the final result.

Thanks to @Nick Singh and the DataLemur team for the continued practice. And as always, I’m very grateful to @Luke Barousse: much of the SQL and PostgreSQL foundation I rely on comes from his teaching: https://lnkd.in/dZwd87sd

15 challenges in, and continuing to focus on writing queries that scale from raw events to structured insights.

If you’re also working through SQL interview-style problems, I’ve been using DataLemur, and I'm happy to share a referral if useful.

#SQL #PostgreSQL #DataEngineering #Analytics #LearningInPublic
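The staged approach described above can be sketched as follows. This is a minimal illustration, not the accepted solution: the table and column names (`employees`, `queries`, `query_id`, `query_starttime`) and the sample rows are assumptions, and SQLite (via Python's `sqlite3`) stands in for PostgreSQL.

```python
import sqlite3

# Hypothetical schema and sample data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY);
CREATE TABLE queries (employee_id INTEGER, query_id INTEGER, query_starttime TEXT);
INSERT INTO employees VALUES (1), (2), (3);
INSERT INTO queries VALUES
  (1, 101, '2023-07-05 10:00:00'),
  (1, 102, '2023-08-01 09:30:00'),
  (1, 102, '2023-08-01 11:00:00'),  -- repeated query_id, counted once
  (2, 201, '2023-09-30 23:59:00'),
  (3, 301, '2023-10-01 00:00:00');  -- outside Q3, so employee 3 has zero
""")

HISTOGRAM_SQL = """
WITH q3_queries AS (            -- stage 1: keep only Q3 2023 activity
    SELECT employee_id, query_id
    FROM queries
    WHERE query_starttime >= '2023-07-01'
      AND query_starttime <  '2023-10-01'
),
per_employee AS (               -- stage 2: distinct queries per employee
    SELECT e.employee_id,
           COUNT(DISTINCT q.query_id) AS unique_queries
    FROM employees AS e
    LEFT JOIN q3_queries AS q ON q.employee_id = e.employee_id
    GROUP BY e.employee_id      -- LEFT JOIN keeps employees with no rows
)
SELECT unique_queries,          -- stage 3: re-aggregate into the histogram
       COUNT(*) AS employee_count
FROM per_employee
GROUP BY unique_queries
ORDER BY unique_queries
"""

histogram = conn.execute(HISTOGRAM_SQL).fetchall()
print(histogram)  # [(0, 1), (1, 1), (2, 1)]
```

Filtering inside the first CTE, rather than in a WHERE clause after the LEFT JOIN, is what keeps the zero-activity employee in the result: a post-join filter on the query timestamp would discard the NULL row that the join produces for that employee.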

2 Comments

Carlos D. Miranda 2w

There were commented lines in my previous post, and it's reasonable to assume that I've been solving these challenges with AI. In this case, I used Claude only as a guide, and I have the full conversation with the model in case anyone would like to take a look at it. I had the logical steps in mind, but I struggled with some of the syntax and with how to wrap up the double aggregation using the two CTEs. I am being honest and true to myself; otherwise I wouldn't bother posting this content to show my commitment to learning SQL as I slowly make my way to my very first Data Engineering job. I work day in and day out toward this target, which I set for this year. Thank you all for your support and for taking the time to read my posts. I really appreciate it.

Nicolás Godoy 1w

Great work, congratulations!!


More Relevant Posts

Carlos D. Miranda
2w
💊 SQL practice: turning business definitions into ranked metrics

Continuing with DataLemur SQL challenges, I worked on a problem focused on identifying the most profitable products based on sales data.

The task was to compute total profit per drug, defined as total_sales - cogs (i.e. Cost of Goods Sold), and then rank the top 3 products by profitability.

While the query itself is straightforward, the key step is correctly translating the business definition (profit) into a reliable calculation and then ordering the results accordingly.

The solution was accepted ✅, and it’s a good reminder that many real-world data problems are less about complexity and more about correctly modeling the metric being asked for.

This kind of pattern shows up frequently when working with financial or product data, where derived metrics (profit, margin, growth) drive decisions.

Thanks to @Nick Singh and the DataLemur team for the continued practice. And as always, I’m very grateful to @Luke Barousse: much of the SQL and PostgreSQL foundation I rely on comes from his teaching: https://lnkd.in/dZwd87sd

18 challenges in, continuing to focus on expressing business logic clearly through SQL.

If you’re also working through SQL interview-style problems, I’ve been using DataLemur, and I'm happy to share a referral if useful.

#SQL #PostgreSQL #DataEngineering #Analytics #LearningInPublic
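A small sketch of the profit ranking described above. The table name `pharmacy_sales`, the drug names, and the figures are invented for illustration; only the definition (profit = total_sales - cogs, top 3) comes from the post. SQLite stands in for PostgreSQL.

```python
import sqlite3

# Hypothetical table and rows; only the profit definition is from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pharmacy_sales (drug TEXT, total_sales INTEGER, cogs INTEGER);
INSERT INTO pharmacy_sales VALUES
  ('DrugA', 1000, 400),
  ('DrugA',  500, 300),
  ('DrugB',  900, 200),
  ('DrugC',  300, 250),
  ('DrugD', 2000, 1500);
""")

TOP_PROFIT_SQL = """
SELECT drug,
       SUM(total_sales - cogs) AS total_profit  -- business definition of profit
FROM pharmacy_sales
GROUP BY drug
ORDER BY total_profit DESC                      -- most profitable first
LIMIT 3
"""

top_profit = conn.execute(TOP_PROFIT_SQL).fetchall()
print(top_profit)  # [('DrugA', 800), ('DrugB', 700), ('DrugD', 500)]
```

If two drugs could tie on profit, adding a secondary sort key (for example the drug name) would make the top 3 deterministic.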
Carlos D. Miranda
2w
📐 SQL practice: weighted averages and data type precision

Continuing with DataLemur SQL challenges, I worked on computing the mean number of items per order, using aggregated data instead of raw rows.

This required calculating a weighted average, where each item count is multiplied by its number of occurrences before dividing by the total number of orders.

One key detail in this problem is handling data types correctly. Without explicit casting, integer division would truncate the result and lead to an incorrect mean. By casting to a numeric type before the division, and rounding only afterwards, the calculation preserves the expected precision.

The solution was accepted ✅, and it was a good reminder that correctness in SQL isn’t just about logic; it also depends on how the database evaluates expressions.

This type of pattern shows up frequently when working with pre-aggregated data, where reconstructing metrics requires careful handling of weights and precision.

Thanks to @Nick Singh and the DataLemur team for the continued practice. And as always, I’m very grateful to @Luke Barousse: much of the SQL and PostgreSQL foundation I rely on comes from his teaching: https://lnkd.in/dZwd87sd

17 challenges in, and continuing to focus on writing queries that are not just correct, but numerically reliable.

If you’re also working through SQL interview-style problems, I’ve been using DataLemur, and I'm happy to share a referral if useful.

#SQL #PostgreSQL #DataEngineering #Analytics #LearningInPublic
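The effect of the cast can be seen in a short sketch. The table `items_per_order` and its columns are hypothetical, and SQLite's `REAL` stands in for the numeric type one would typically cast to in PostgreSQL.

```python
import sqlite3

# Hypothetical pre-aggregated table: how many orders contained N items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items_per_order (item_count INTEGER, order_occurrences INTEGER);
INSERT INTO items_per_order VALUES (1, 500), (2, 1000), (3, 800), (4, 1000);
""")

# Integer / integer: the fractional part is truncated.
truncated_mean = conn.execute("""
SELECT SUM(item_count * order_occurrences) / SUM(order_occurrences)
FROM items_per_order
""").fetchone()[0]

# Cast one operand before dividing, then round the weighted mean.
weighted_mean = conn.execute("""
SELECT ROUND(
         CAST(SUM(item_count * order_occurrences) AS REAL)
         / SUM(order_occurrences),
         1)
FROM items_per_order
""").fetchone()[0]

print(truncated_mean)  # 2   (8900 / 3300 with integer division)
print(weighted_mean)   # 2.7
```

The weights sum to 3300 orders and the weighted item total is 8900, so the true mean is about 2.697; the uncast query reports 2.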
Rohith Vannawada
3w Edited
🚀 Came across these SQL handwritten notes and they cover EVERYTHING. Saving this for life.

Here's everything covered in 24 pages:
✅ What is SQL & why analysts love it
✅ SELECT, WHERE, LIMIT: the core building blocks
✅ Comparison & Logical Operators (LIKE, IN, BETWEEN, AND, OR, NOT)
✅ Arithmetic in SQL
✅ NULL values & how to handle them
✅ CREATE TABLE, INSERT INTO, UPDATE, DELETE
✅ Aliases, ORDER BY, Comments
✅ Aggregate Functions (COUNT, SUM, AVG, MIN, MAX)
✅ GROUP BY, HAVING Clause
✅ CASE Statement
✅ DISTINCT
✅ All types of JOINs (INNER, LEFT, RIGHT, CROSS, SELF)
✅ UNION, EXISTS, ANY, ALL
✅ INSERT INTO SELECT & IFNULL()

Writing notes by hand is underrated. It forces you to actually understand, not just copy-paste.

If you're learning SQL, start here. These are the fundamentals that every data analyst uses daily.

📌 Save this post so you don't lose it.
💬 Drop a "SQL" in the comments if you're on this journey too, let's connect! 🤝

Follow 👨💻 Rohith Vannawada for a regular curated feed of #dataAnalyst insights and practical tips to level up your data journey.

#SQL #CareerGrowth #Programming #Upskilling #DataScience #DataAnalyst #DataAnalystInterview #ApacheSpark #InterviewPrep #DataEngineering #TechInterviewTips #data #dataengineering #dataengineerjobs #interviewquestions #interviewprep #newlearnings #spark #azurecloud #datacleaning #pyspark #optimization #interviewsuccess #interviewskills #LinkedInLearning #DataProcessing #BigData #TechCareer #DataJobs #KnowledgeSharing #TechSkills #LearningResources #PySparkFunctions #CareerDevelopment #mapreduce #interviewguidance #interviewingskills #PySpark #DataAnalytics #TechCareers #SQLInterview #LearningNeverStops

Bhavya Katari
4d
Mastering SQL: Key Techniques Every Data Professional Should Know

SQL is more than just querying data; it’s about writing efficient, scalable, and optimized queries that power real-world systems.

Here are some essential SQL techniques that every Data Engineer, Analyst, or Developer should master:

Window Functions
Use ROW_NUMBER(), RANK(), DENSE_RANK() to perform advanced analytics without collapsing rows.

CTEs (Common Table Expressions)
Break complex queries into readable blocks using WITH clauses for better maintainability.

Joins Optimization
Understand when to use INNER, LEFT, RIGHT, and FULL joins, and always optimize join conditions for performance.

Indexing Strategies
Proper indexing can drastically improve query performance. Know when NOT to over-index.

Subqueries vs CTEs
Choose wisely: CTEs often improve readability, while subqueries can sometimes perform better depending on execution plans.

Aggregation with GROUP BY & HAVING
Filter aggregated results efficiently and avoid unnecessary data processing.

Query Execution Plans
Always analyze execution plans to identify bottlenecks and optimize queries.

Pro Tip: Writing SQL is easy; writing optimized SQL is what makes you stand out in real-world data systems.

Let’s keep learning and building efficient data pipelines!

#SQL #DataEngineering #DataAnalytics #Database #BigData #Cloud #ETL #DataScience #TechSkills
Telixia

2,601 followers
1w
🚀 From SQL Basics to Real-World Optimization, All in One Place

Data skills aren’t just “nice to have” anymore, they’re foundational. This resource breaks down 100 essential SQL concepts every data professional should understand, from core fundamentals to advanced, real-world problem solving.

📌 What’s inside:
• 🔹 Foundations that matter: understanding databases, keys, constraints, and core SQL commands
• 🔹 Querying like a pro: joins, subqueries, CTEs, window functions, and data aggregation
• 🔹 Real-world problem solving: finding duplicates, ranking data, salary insights, and business logic
• 🔹 Performance & optimization: indexing, query execution plans, partitioning, and scaling queries
• 🔹 Advanced concepts: transactions, ACID properties, triggers, stored procedures, and more

💡 Whether you're starting out or refining your data skills, mastering SQL is what separates basic data handling from true data-driven decision making.

📊 The difference isn’t just writing queries, it’s understanding how data works at scale.

🔁 Save this. Share with your team. Use it as a roadmap.

#SQL #DataEngineering #DataAnalytics #Database #TechSkills #Learning #CareerGrowth #BigData #Analytics #Developers #DataScience
Sukriti Ranjan
2w
🚀 From Writing SQL Queries → Thinking Like a Data Professional

Most SQL problems look easy… until you try to optimize them.

Today I worked on a simple problem:

🧠 Problem Statement: Fetch ITEM_NAME and PRICE from SHOP_1 and SHOP_2 where PRICE > 25.

🧩 The obvious solution

SELECT ITEM_NAME, PRICE FROM SHOP_1 WHERE PRICE > 25
UNION ALL
SELECT ITEM_NAME, PRICE FROM SHOP_2 WHERE PRICE > 25;

✔ Correct
✔ Straightforward

But… is it the best way?

⚡ The optimized mindset

SELECT ITEM_NAME, PRICE
FROM (
    SELECT ITEM_NAME, PRICE FROM SHOP_1
    UNION ALL
    SELECT ITEM_NAME, PRICE FROM SHOP_2
) AS COMBINED
WHERE PRICE > 25;

🔍 What changed?
Instead of solving the problem… I focused on improving the approach:
🔹 Reduced repeated filtering
🔹 Made it scalable (works for multiple tables)
🔹 Improved readability

💡 Real Learning
Writing SQL isn’t just about getting the output. It’s about:
🔹 Thinking in sets
🔹 Writing scalable logic
🔹 Making queries easy to maintain

🏆 Final Thought
👉 Anyone can write a working query.
👉 But strong data analysts write queries that scale.

💬 Curious: would you filter before or after combining data?

#SQL #DataAnalytics #DataAnalyst #Learning #InterviewPrep #DataEngineering #Optimization Coding Ninjas Codebasics
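To make the two forms above concrete, here is a small sketch that runs both against sample rows and checks that they return the same result. The table and column names come from the post; the rows are invented, and SQLite is used only as a convenient stand-in engine.

```python
import sqlite3

# Sample rows are invented; table and column names are from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SHOP_1 (ITEM_NAME TEXT, PRICE INTEGER);
CREATE TABLE SHOP_2 (ITEM_NAME TEXT, PRICE INTEGER);
INSERT INTO SHOP_1 VALUES ('pen', 10), ('bag', 40);
INSERT INTO SHOP_2 VALUES ('lamp', 30), ('cup', 25);
""")

# Filter inside each branch, then combine.
FILTER_FIRST = """
SELECT ITEM_NAME, PRICE FROM SHOP_1 WHERE PRICE > 25
UNION ALL
SELECT ITEM_NAME, PRICE FROM SHOP_2 WHERE PRICE > 25
"""

# Combine first, then filter once on the combined rows.
FILTER_LAST = """
SELECT ITEM_NAME, PRICE
FROM (
    SELECT ITEM_NAME, PRICE FROM SHOP_1
    UNION ALL
    SELECT ITEM_NAME, PRICE FROM SHOP_2
) AS COMBINED
WHERE PRICE > 25
"""

# UNION ALL gives no ordering guarantee, so sort before comparing.
filter_first = sorted(conn.execute(FILTER_FIRST).fetchall())
filter_last = sorted(conn.execute(FILTER_LAST).fetchall())
print(filter_first)  # [('bag', 40), ('lamp', 30)]
print(filter_first == filter_last)  # True
```

The two queries are logically equivalent, so the choice is mostly about readability and maintenance. Whether either one runs faster depends on the engine: many query planners push the outer filter down into each branch anyway, so the execution plan is the place to check before assuming a performance difference.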

Arup Moulick
2w
From simple queries to real-world SQL thinking 🚀
----------------------------------------------------------------
Today I solved a problem where I had to analyze transactions data and report:
• Total transactions
• Approved transactions
• Total amount
• Approved amount
• Grouped by month and country

At first, it looked like a basic aggregation problem… but it actually required combining multiple concepts:
✔ Extracting month from date
✔ Grouping on multiple columns
✔ Conditional aggregation
✔ Writing clean and scalable SQL

🧠 Key learning: Instead of writing multiple queries, everything can be solved in a single query using conditional aggregation.

💡 One powerful trick: Using conditions inside SUM:

SUM(state = 'approved')

This helped me count approved transactions efficiently.

💻 Solution:

SELECT
    DATE_FORMAT(trans_date, '%Y-%m') AS month,
    country,
    COUNT(*) AS trans_count,
    SUM(state = 'approved') AS approved_count,
    SUM(amount) AS trans_total_amount,
    SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM Transactions
GROUP BY DATE_FORMAT(trans_date, '%Y-%m'), country;

🚀 This problem helped me strengthen: SQL aggregation • Data analysis thinking • Real-world query logic

Learning SQL step by step and sharing the journey 👇

#SQL #DataAnalytics #LearningInPublic #LeetCode #100DaysOfCode
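A runnable sketch of the conditional aggregation above, using invented sample rows. Two things differ from the posted query and are assumptions of this sketch: SQLite's `strftime('%Y-%m', ...)` replaces MySQL's `DATE_FORMAT`, and an `ORDER BY` is added so the output is deterministic.

```python
import sqlite3

# Sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (
    id INTEGER, country TEXT, state TEXT, amount INTEGER, trans_date TEXT
);
INSERT INTO Transactions VALUES
  (1, 'US', 'approved', 1000, '2018-12-18'),
  (2, 'US', 'declined', 2000, '2018-12-19'),
  (3, 'US', 'approved', 2000, '2019-01-01'),
  (4, 'DE', 'approved', 2000, '2019-01-07');
""")

REPORT_SQL = """
SELECT
    strftime('%Y-%m', trans_date) AS month,   -- SQLite stand-in for DATE_FORMAT
    country,
    COUNT(*) AS trans_count,
    SUM(state = 'approved') AS approved_count,  -- comparison yields 0 or 1
    SUM(amount) AS trans_total_amount,
    SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END)
        AS approved_total_amount
FROM Transactions
GROUP BY month, country
ORDER BY month, country
"""

report = conn.execute(REPORT_SQL).fetchall()
for row in report:
    print(row)
# ('2018-12', 'US', 2, 1, 3000, 1000)
# ('2019-01', 'DE', 1, 1, 2000, 2000)
# ('2019-01', 'US', 1, 1, 2000, 2000)
```

`SUM(state = 'approved')` relies on the engine treating a comparison as 0 or 1, which MySQL and SQLite do. PostgreSQL does not sum booleans, so there the same count would be written with `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` or `COUNT(*) FILTER (WHERE ...)`.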
Rohit Kumar Singh
4d
🚀 SQL QUALIFY Clause – Visual Guide for Real-World Data Analysis

Understanding SQL is not just about writing queries; it’s about applying logic in the most efficient way. This visual guide focuses on one of the most powerful yet underrated SQL features: the QUALIFY clause.

🔍 What this covers:

📌 1. What is QUALIFY?
QUALIFY is used to filter results generated by window functions like "ROW_NUMBER()", "RANK()", etc. It works after the SELECT stage, making it perfect for analytical queries.

📌 2. SQL Execution Order
The diagram clearly shows where QUALIFY fits:
👉 FROM → WHERE → GROUP BY → HAVING → SELECT → QUALIFY → ORDER BY
This helps in understanding how SQL actually processes data step-by-step.

📌 3. Syntax Breakdown
A clean structure showing how QUALIFY integrates with other clauses, useful for both beginners and interview preparation.

📌 4. Real Dataset Example
A sample employees table is used to demonstrate practical scenarios, making learning more relatable and application-based.

📌 5. Example 1 – Top Employee per Department
Using "ROW_NUMBER()" with QUALIFY to fetch the highest-paid employee in each department.

📌 6. Example 2 – Top 2 Employees per Department
Using "RANK()" to retrieve top performers, a common real-world requirement.

📌 7. Example 3 – Remove Duplicates (Latest Record)
A practical use case where QUALIFY helps in deduplication by keeping only the most recent record.

📌 8. WHERE vs HAVING vs QUALIFY
A side-by-side comparison to clearly understand when to use each clause.

📌 9. Key Takeaway
✔ Cleaner queries
✔ No need for subqueries
✔ Optimized for analytics workflows

💡 Why this matters?
In real-world data analysis, writing optimized and readable queries is a key skill. QUALIFY helps reduce complexity and improves performance when working with window functions.

If you're preparing for:
📊 Data Analyst roles
📈 Business Intelligence
💻 SQL Interviews
Then mastering QUALIFY can give you a strong edge.

---
📢 Let me know your thoughts in the comments & share if this helped you!

#SQL #DataAnalytics #LearnSQL #DataScience #BusinessIntelligence #WindowFunctions #BigQuery #Snowflake #SQLQueries #DataEngineer #Analytics #TechSkills #InterviewPreparation #DataLearning #Coding #CareerGrowth #LinkedInLearning
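As a rough illustration of Example 1 (top employee per department), the sketch below shows the QUALIFY form as text only and runs the subquery form it replaces. Assumptions: the `employees` rows are invented, the QUALIFY text follows the syntax used by engines that support the clause (such as Snowflake) and is not executed here, and SQLite 3.25 or later is available for window functions.

```python
import sqlite3

# Invented sample data for a hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employees VALUES
  ('Ana', 'Eng', 120), ('Ben', 'Eng', 100),
  ('Cy',  'Ops',  90), ('Di',  'Ops',  95);
""")

# QUALIFY form, kept as text only: SQLite does not support QUALIFY.
QUALIFY_FORM = """
SELECT name, department, salary
FROM employees
QUALIFY ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) = 1
"""

# Portable equivalent: compute the row number in a subquery,
# then filter on it in the outer WHERE.
SUBQUERY_FORM = """
SELECT name, department, salary
FROM (
    SELECT name, department, salary,
           ROW_NUMBER() OVER (
               PARTITION BY department ORDER BY salary DESC
           ) AS rn
    FROM employees
) AS ranked
WHERE rn = 1
ORDER BY department
"""

top_paid = conn.execute(SUBQUERY_FORM).fetchall()
print(top_paid)  # [('Ana', 'Eng', 120), ('Di', 'Ops', 95)]
```

The extra subquery exists because a window function's result cannot be referenced in the WHERE clause of the same query level; QUALIFY removes that nesting in engines that offer it.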
Chandu Deeti
6d
🚨 90% of Developers Still Struggle with SQL Dates…

And honestly, it’s NOT because SQL is hard. It’s because most people don’t know the right functions.

Here are 9 SQL Date/Time functions that will instantly level up your queries 👇

🧠 The ones you should NEVER ignore:
• GETDATE() → Current date & time
• DATEADD() → Add/Subtract time (super useful in ETL)
• DATEDIFF() → Find gaps between dates
• FORMAT() → Make your output readable
• ISDATE() → Avoid bad data issues

💡 Real talk: In Data Engineering (SSIS / ETL), 70% of bugs come from wrong date handling. If you master these, you’re already ahead of most developers.

🔥 Pro Tip: Use GETUTCDATE() when working with global systems; it saves you from timezone nightmares.

---
📌 Save this post, you’ll need it later
🔁 Share with someone struggling with SQL
💬 Comment “SQL” and I’ll share more advanced tricks

#SQL #DataEngineering #ETL #SSIS #Azure #Analytics #LearnSQL #TechCareers #Developers
Aditi Paraskar
3w
🚀 Mastering SQL – One Step Closer to Becoming a Data Pro!💥

In today’s data-driven world, SQL is not just a skill, it’s a superpower. 💡 Whether you’re aiming for a career in Data Analysis, Backend Development, or Business Intelligence, understanding SQL is your first big step.

Here’s a quick snapshot of what every aspiring data enthusiast should focus on:
🔹 SQL Basics – Understanding databases, tables, rows, and columns
🔹 Data Types – Knowing how data is stored (INT, VARCHAR, DATE, etc.)
🔹 CRUD Operations – The foundation: SELECT, INSERT, UPDATE, DELETE
🔹 Filtering & Sorting – Using WHERE, ORDER BY to get meaningful insights
🔹 Aggregate Functions – COUNT, SUM, AVG, MIN, MAX to analyze data
🔹 Joins – Combining multiple tables like a pro (INNER, LEFT, RIGHT, FULL)
🔹 Subqueries & Aliases – Writing smarter and cleaner queries
🔹 Constraints – Maintaining data integrity (PRIMARY KEY, FOREIGN KEY, etc.)
🔹 Table Operations – CREATE, ALTER, DROP
🔹 Advanced Concepts – Indexes, Views, Stored Procedures & Transactions

✨ Learning SQL is not about memorizing queries; it’s about understanding how data works and how to extract value from it.

#SQL #DataAnalytics #LearningJourney #TechSkills #DataScience #StudentLife #CareerGrowth #Database
