🗓️ SQL Challenge Day #37: Movie Rating 🎬
🔹 Solve two tricky ranking problems in one query!

🔹 Problem:
Part 1️⃣: Find the user with the most ratings (tie-break: lexicographically smaller name)
Part 2️⃣: Find the movie with the highest average rating in February 2020 (tie-break: lexicographically smaller title)

🔹 Solution (UNION ALL):

```sql
(
    SELECT u.name AS results
    FROM Users u
    JOIN (
        SELECT user_id, COUNT(user_id) AS c
        FROM MovieRating
        GROUP BY user_id
    ) mr ON u.user_id = mr.user_id
    ORDER BY c DESC, u.name ASC
    LIMIT 1
)
UNION ALL
(
    SELECT m.title AS results
    FROM Movies m
    JOIN (
        SELECT movie_id, AVG(rating) AS r
        FROM MovieRating
        WHERE DATE_FORMAT(created_at, '%Y-%m') = '2020-02'
        GROUP BY movie_id
    ) mav ON m.movie_id = mav.movie_id
    ORDER BY r DESC, m.title ASC
    LIMIT 1
);
```

✅ Result: Accepted

💡 Key Takeaway: **UNION ALL + subqueries** cleanly separates two distinct problems!

⚠️ Critical details:
- `DATE_FORMAT(created_at, '%Y-%m') = '2020-02'` isolates February 2020
- Dual ordering (`c DESC, name ASC`) handles the tie-breaks correctly
- Parentheses around each SELECT are mandatory for LIMIT inside a UNION

👇 Your turn: What’s your strategy for handling multi-part SQL problems? Do you always split them like this?

#SQL #LeetCode #DataEngineering #ProblemSolving #Coding #LearningInPublic #Database #DataAnalytics
More Relevant Posts
✅ Solved a SQL problem on StrataScratch — Day 57 of my SQL Journey 💪

Text looks simple… until you try to count words inside it 👀

Today’s challenge: count exact occurrences of specific words — not substrings, but precise matches.

The approach:
• Normalised text using LOWER()
• Used REGEXP with word boundaries (\b) for exact matching
• Replaced matches and compared string lengths
• Derived counts using the length-difference logic
• Combined results using UNION

What I practised:
• REGEXP for pattern matching
• String manipulation with LENGTH & REPLACE
• Handling edge cases like “bull” vs “bullish”
• Translating text problems into SQL logic

What stood out — text data looks simple, but precision changes everything. Small variations completely change the meaning. That’s where careful querying matters.

SQL isn’t limited to numbers — it can handle text if you think right.

Consistent learning, one query at a time 🚀

#SQL #StrataScratch #DataAnalytics #LearningInPublic #SQLPractice
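To make the length-difference trick concrete, here is a minimal sketch in MySQL 8+ flavour. The `messages(message)` table and the target words 'bull' and 'bear' are illustrative assumptions, not the actual StrataScratch schema:

```sql
-- Count exact word matches: strip every whole-word occurrence with
-- REGEXP_REPLACE, then divide the removed length by the word's length.
SELECT 'bull' AS word,
       SUM((CHAR_LENGTH(LOWER(message))
            - CHAR_LENGTH(REGEXP_REPLACE(LOWER(message), '\\bbull\\b', '')))
           / CHAR_LENGTH('bull')) AS occurrences
FROM messages
UNION
SELECT 'bear',
       SUM((CHAR_LENGTH(LOWER(message))
            - CHAR_LENGTH(REGEXP_REPLACE(LOWER(message), '\\bbear\\b', '')))
           / CHAR_LENGTH('bear'))
FROM messages;
```

The `\b` word boundary is what keeps "bullish" from being counted as "bull".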
Stop writing repetitive SQL. Start using CTEs. 💡

Ever had to filter a single table, only to realize you need to self-join that same filtered data back to the original? Doing this with nested subqueries is a recipe for The Chaos:

❌ Hard-to-read logic.
❌ Redundant code that’s a nightmare to maintain.
❌ Performance hits from recalculating the same filters.

The better way? Common Table Expressions (CTEs). By defining your subset once at the top, you unlock The Clarity:

✅ Define Once: Your filtering logic lives in one place.
✅ Readability: Your code tells a story, step-by-step.
✅ Efficiency: You join a clean, pre-filtered subset instead of a messy subquery.

As the data grows, readability becomes just as important as performance. If you aren't using CTEs for your self-joins yet, this is your sign to start.

How do you prefer to handle complex self-joins? Subqueries, CTEs, or Temp Tables? Let’s discuss in the comments! 👇

#SQL #DataEngineering #BusinessIntelligence #Analytics #Database #CodingTips #TechCommunity
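Here is a minimal sketch of the pattern, assuming a hypothetical `employees(id, name, manager_id, status)` table:

```sql
-- Filtering logic is defined once at the top, then self-joined.
WITH active_employees AS (
    SELECT id, name, manager_id
    FROM employees
    WHERE status = 'active'          -- the one place this filter lives
)
SELECT emp.name AS employee,
       mgr.name AS manager
FROM active_employees emp
LEFT JOIN active_employees mgr       -- self-join the clean, pre-filtered subset
       ON emp.manager_id = mgr.id;
```

With nested subqueries, that `status = 'active'` filter would have to be written (and maintained) twice.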
The interviewer asked: "Which line of this SQL query runs first?"

I pointed at the SELECT statement. He smiled, shook his head, and said: "That’s exactly why your queries are slow."

It was a humbling moment, but it taught me the single most important lesson in SQL: how we write code is NOT how the machine reads it. If you want to master performance in 2026, you have to stop thinking like a writer and start thinking like the Query Optimizer.

Here is the "secret story" of a query’s life:

1️⃣ FROM & JOIN: The engine first goes to the warehouse to find the tables. It doesn't care what you want to "select" yet — it just needs the raw data.
2️⃣ WHERE: It filters the rows before doing any heavy lifting. This is where you save (or waste) money.
3️⃣ GROUP BY & HAVING: It aggregates the data and then filters those groups.
4️⃣ SELECT: Only NOW does it pick the columns you actually asked for.
5️⃣ ORDER BY & LIMIT: Finally, it sorts the result and gives you the top rows.

If you don't understand this order and put a heavy calculation in the SELECT while filtering on it in the WHERE, the engine has to work twice as hard.

👇 Have you ever been "tricked" by this in an interview? Or worse... in a production environment?

#SQL #DataAnalytics #InterviewPrep #CodingLife #Database #QueryOptimization #MicrosoftFabric #2026Tech
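To see the order in one place, here is a small query annotated with those logical processing steps; the `orders` table and its columns are hypothetical:

```sql
SELECT customer_id,                  -- 4) only now are columns projected
       SUM(amount) AS total_spend
FROM orders                          -- 1) locate the raw data first
WHERE order_date >= '2025-01-01'     -- 2) filter rows before heavy lifting
GROUP BY customer_id                 -- 3) aggregate...
HAVING SUM(amount) > 1000            --    ...then filter the groups
ORDER BY total_spend DESC            -- 5) sort the final result...
LIMIT 10;                            --    ...and keep the top rows
```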
✅ Solved a SQL problem on LeetCode — Day 49 of my SQL Journey 💪

Consistency isn’t always visible… but patterns reveal it over time 📈

Today’s problem was about identifying users with persistent behaviour — not just active users, but those who show the same action consistently across days.

I worked on analysing behaviour patterns to:
• Track user actions across consecutive days
• Group continuous activity using ROW_NUMBER() logic
• Identify streaks by adjusting date gaps
• Filter users with meaningful streak length (≥ 5 days)
• Select the strongest pattern per user using RANK()

What I practised:
• Window functions for streak detection
• Handling sequential data with date logic
• GROUP BY + HAVING for filtering patterns
• Translating consistency into measurable conditions

What stood out — activity can be random… consistency is never random. When actions repeat over time, they stop being noise and become behaviour. That’s where real insights start.

SQL doesn’t just analyse data. It uncovers patterns hidden in time.

Consistent learning, one query at a time 🚀

#SQL #LeetCode #SQLPractice #DataAnalytics #LearningInPublic
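For readers curious about the mechanics, here is a hedged sketch of the classic gaps-and-islands trick the post describes, under an assumed `activity(user_id, action, action_date)` schema (MySQL syntax):

```sql
WITH numbered AS (
    SELECT user_id, action, action_date,
           ROW_NUMBER() OVER (
               PARTITION BY user_id, action
               ORDER BY action_date
           ) AS rn
    FROM activity
)
SELECT user_id, action, COUNT(*) AS streak_len
FROM numbered
-- consecutive days share the same (date - rn) anchor, so each
-- unbroken run of days collapses into a single group
GROUP BY user_id, action, DATE_SUB(action_date, INTERVAL rn DAY)
HAVING COUNT(*) >= 5;   -- keep only meaningful streaks
```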
I've read a lot of SQL written by smart people. Most of it is unreadable.

Not because the logic is wrong. But three months later, nobody can tell you why it does what it does. Including the person who wrote it.

After 8 years of writing SQL daily, here's the system I use to keep queries readable, debuggable, and easy to hand off:

🔢 Step 0: Check the row count before writing anything
Run SELECT SUM(1) FROM main_table and put the number in a comment at the top. This one habit has saved me from fan-out disasters more times than I can count.

🔗 Always use explicit JOINs
Never connect tables through the WHERE clause. The join logic and the filter logic should live in separate, predictable places. And always default to LEFT JOIN — INNER JOIN silently drops rows, which silently corrupts your results.

🏷️ Meaningful aliases, not alphabet soup
If you're aliasing tables a, b, c — your colleagues are not thanking you. Two or three descriptive characters (ai, sf, pc) is all you need.

🧱 One CTE, one job
Break complex logic into named CTEs. Each one does exactly one thing. Structure them Source → Filtered → Aggregated → Final. You can read it like a story.

💬 Comment the why, not the what
The code shows what's happening. Comments should explain why a decision was made — the business rules, the edge cases, the intentional exclusions.

Readable SQL is a form of communication. It signals you're thinking about the person who comes after you, not just getting the right answer today.

I wrote up the full breakdown with real code examples on the blog — link in the comments.

#SQL #DataAnalytics #DataEngineering #BI #Analytics #BestPractices #DataAnalysis #DataFam
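Here is a compact sketch of that Source → Filtered → Aggregated → Final structure; the tables (`orders`, `customers`) and the business rule are hypothetical stand-ins:

```sql
-- Row count: SELECT SUM(1) FROM orders;  -- note the result here (Step 0)

WITH source_orders AS (            -- Source: one table, no logic yet
    SELECT order_id, customer_id, amount, order_date
    FROM orders
),
recent_orders AS (                 -- Filtered: WHY — report covers 2025 only
    SELECT *
    FROM source_orders
    WHERE order_date >= '2025-01-01'
),
customer_totals AS (               -- Aggregated: one job — spend per customer
    SELECT customer_id, SUM(amount) AS total_spend
    FROM recent_orders
    GROUP BY customer_id
)
SELECT cu.name, ct.total_spend     -- Final: join and present
FROM customer_totals ct
LEFT JOIN customers cu             -- LEFT JOIN so unmatched customers
       ON ct.customer_id = cu.customer_id  -- don't silently drop out
ORDER BY ct.total_spend DESC;
```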
Solved the “The PADS” SQL Challenge on HackerRank today!

This problem was a great combination of string manipulation + aggregation + sorting logic — exactly the kind of thinking required for real-world data analysis 💡

Generate the following two result sets:

1. Query an alphabetically ordered list of all names in OCCUPATIONS, immediately followed by the first letter of each profession as a parenthetical (i.e., enclosed in parentheses). For example: AnActorName(A), ADoctorName(D), AProfessorName(P), and ASingerName(S).

🔹 Part 1: Formatting Names with Occupations
Used CASE along with string concatenation (||) to attach the first letter of each occupation to the name.
Example output: 👉 Samantha(D), Julia(A), Maria(P)
✔ Key concepts used:
• CASE WHEN
• SUBSTRING / first-character extraction
• String concatenation

2. Query the number of occurrences of each occupation in OCCUPATIONS. Sort the occurrences in ascending order and output them in the following format:
There are a total of [occupation_count] [occupation]s.
where [occupation_count] is the number of occurrences of an occupation in OCCUPATIONS and [occupation] is the lowercase occupation name. If more than one occupation has the same [occupation_count], they should be ordered alphabetically.
Note: There will be at least two entries in the table for each type of occupation.

🔹 Part 2: Counting Occupations
Used COUNT(*) with GROUP BY to calculate total occurrences of each occupation and formatted the output into readable sentences.
Example: 👉 There are a total of 3 doctors.
✔ Key concepts used:
• GROUP BY
• COUNT(*)
• LOWER() for formatting
• ORDER BY for sorting results

💡 Learning takeaway: This challenge reinforced that SQL is not just about querying data, but also about presenting it in a meaningful and readable format. Consistency in solving such problems is helping me strengthen my foundation step by step 📊

#SQL #HackerRank #DataAnalytics #LearningJourney #WomenInTech #PracticeMakesPerfect #Upskilling #FutureDataAnalyst
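A compact sketch of both parts, using the Oracle/PostgreSQL-style `||` concatenation the post mentions (MySQL would use CONCAT instead). The post's CASE-based variant branches per occupation; `SUBSTR` on the occupation column achieves the same parenthetical more directly:

```sql
-- Part 1: each name followed by the first letter of its occupation
SELECT name || '(' || SUBSTR(occupation, 1, 1) || ')'
FROM OCCUPATIONS
ORDER BY name;

-- Part 2: one sentence per occupation, ascending count, alphabetical tie-break
SELECT 'There are a total of ' || COUNT(*) || ' ' || LOWER(occupation) || 's.'
FROM OCCUPATIONS
GROUP BY occupation
ORDER BY COUNT(*), occupation;
```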
**Common SQL Mistakes (and How to Fix Them)**

Over time, I’ve seen a few SQL mistakes that can silently break logic or performance. Here are some common ones and how to avoid them:

1. **Forgetting the WHERE Clause**
Running `DELETE` or `UPDATE` without a `WHERE` clause can wipe out entire tables. Always double-check your conditions and use transactions when working with critical data. One small miss can lead to massive data loss.

2. **Overusing SELECT ***
Using `SELECT *` fetches unnecessary columns, slows down queries, and makes code less readable. Instead, select only the columns you need — it improves performance and keeps queries future-proof.

3. **Comparing with NULL Incorrectly**
`NULL` is not a value, so `= NULL` won’t work. Always use `IS NULL` or `IS NOT NULL`. This ensures correct filtering and avoids unexpected empty results.

4. **Grouping Issues in SELECT**
Every non-aggregated column in your `SELECT` must be in the `GROUP BY`. Ignoring this leads to errors or incorrect results. Follow SQL standards for clean and accurate aggregation.

5. **Incorrect GROUP BY Usage**
Grouping without proper structure can make your results confusing. Use meaningful groupings and ensure your query clearly reflects the business logic behind the data.

6. **Missing Parentheses in Complex Logic**
When combining `AND` and `OR`, operator precedence can change results. Always use parentheses to define logic explicitly; it improves readability and prevents logical bugs.

💡 **Final Thought:** Small SQL mistakes can lead to big data issues. Writing clean, intentional queries is just as important as getting the result.

If you’ve faced similar issues, I would love to hear your experiences 👇

Follow Aman Gambhir for more content like this.

#SQL #sqltips #sqlquery #query #sqlmistakes #optimization
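Three of these fixes in miniature (generic SQL; the table names are made up for illustration):

```sql
-- 1. Wrap destructive statements in a transaction so you can verify first
BEGIN;
DELETE FROM customers WHERE last_login < '2020-01-01';
-- run a sanity-check SELECT here, then:
COMMIT;   -- or ROLLBACK if the affected row count looks wrong

-- 3. NULL is not a value: use IS NULL, never = NULL
SELECT * FROM orders WHERE shipped_date IS NULL;

-- 6. Parentheses make AND/OR precedence explicit
SELECT * FROM products
WHERE (category = 'books' OR category = 'games')
  AND price < 20;
```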
✅ Solved a SQL problem on LeetCode — Day 40 of my SQL Journey 💪

Not every customer leaves suddenly… some show signs before they churn. ⚠️

Today’s problem was about identifying churn-risk customers — users who are still active but showing warning behaviour.

I used aggregation and event analysis to:
• Track each user’s latest subscription status
• Identify downgrade behaviour over time
• Compare current spend with the historical maximum
• Measure total subscription duration
• Filter users based on combined risk signals

What I practised:
• Working with event-based data using GROUP BY
• Using CASE WHEN to capture behavioural signals
• Extracting latest values with ordered aggregation
• Applying multiple conditions to detect patterns

What stood out — churn doesn’t happen instantly… it builds up through small changes. A downgrade here, a drop in spending there. That’s where the real insight lies.

SQL isn’t just about analysing what happened. It’s about spotting what might happen next.

Consistent learning, one query at a time 🚀

#SQL #LeetCode #SQLPractice #DataAnalytics #LearningInPublic
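As a rough illustration of combining such signals, here is a heavily hedged MySQL-style sketch; the `subscriptions(user_id, amount, event_date)` schema and both thresholds are invented for the example, not the actual LeetCode problem:

```sql
SELECT user_id,
       -- ordered aggregation: the most recent amount per user
       SUBSTRING_INDEX(
           GROUP_CONCAT(amount ORDER BY event_date DESC), ',', 1
       ) + 0 AS current_spend,                        -- +0 coerces to numeric
       MAX(amount)                                AS peak_spend,
       DATEDIFF(MAX(event_date), MIN(event_date)) AS tenure_days
FROM subscriptions
GROUP BY user_id
HAVING current_spend < peak_spend   -- spending has dropped from its peak
   AND tenure_days >= 180;          -- established users only (invented threshold)
```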
I’ve been diving deep into SQL lately, building a full set of queries from basics to advanced topics. Here’s a sneak peek of what I’ve worked on 👇

1. Database & Table Creation – `CREATE DATABASE` & `CREATE TABLE`
2. Basic SELECT & Filtering – `SELECT`, `WHERE`, `ORDER BY`, `LIMIT`
3. Conditional Logic & Patterns – `AND`/`OR`, `IN`, `LIKE`, `CASE`
4. Aggregations & Grouping – `COUNT`, `SUM`, `AVG`, `MIN`/`MAX`, `HAVING`
5. Joins & Relationships – `INNER JOIN`, `LEFT JOIN`, `RIGHT JOIN`, `FULL JOIN`, `SELF JOIN`
6. Subqueries & Pagination – nested queries, `LIMIT OFFSET`
7. Window Functions – `ROW_NUMBER()`, `RANK()`, `DENSE_RANK()`, `PARTITION BY`, `SUM`/`AVG OVER()`, `LAG`, `LEAD`

This project helped me understand data filtering, aggregation, relational joins, and advanced analytics using window functions.

Always open to feedback and suggestions from SQL pros!

#SQL #DataAnalytics #Database #Learning #Tech #Coding #SQLPractice
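To ground point 7, here is a single query exercising several of those window functions at once, against a hypothetical `sales(region, rep, amount, sale_date)` table:

```sql
SELECT region, rep, amount,
       -- unique sequence vs. rank with gaps, per region, by amount
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS row_num,
       RANK()       OVER (PARTITION BY region ORDER BY amount DESC) AS rnk,
       -- aggregate over a window: no GROUP BY, rows are preserved
       SUM(amount)  OVER (PARTITION BY region)                      AS region_total,
       -- previous sale in time order, NULL for the first row
       LAG(amount)  OVER (PARTITION BY region ORDER BY sale_date)   AS prev_amount
FROM sales;
```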
```python
import pandas as pd
from helpers import calculate_total, format_currency
import os

# Setup path
script_dir = os.path.dirname(os.path.abspath(__file__))
data_path = os.path.join(script_dir, 'data', 'sales.csv')

# Read data
df = pd.read_csv(data_path)

# DEBUG: Check actual column names
print("Actual column names:", df.columns.tolist())

# FIX: Remove extra spaces from column names
df.columns = df.columns.str.strip()

# Now force columns to be numeric
df['quantity'] = pd.to_numeric(df['quantity'])
df['price'] = pd.to_numeric(df['price'])

# Calculate totals
totals = []
for index, row in df.iterrows():
    total = calculate_total(row['quantity'], row['price'])
    totals.append(total)
df['total'] = totals

# Display results
print("\nSales Data:")
for index, row in df.iterrows():
    formatted_total = format_currency(row['total'])
    print(f"{row['product']}: {formatted_total}")

grand_total = df['total'].sum()
print(f"\nGrand Total: {format_currency(grand_total)}")
```