Day 7/30: 5 SQL queries I use every single day as a Data Analyst.

Not textbook queries. The ones I actually run. On real data. Every week.

Save this. You'll need it. 🔖

1️⃣ Finding duplicates instantly

SELECT order_id, COUNT(*) AS dup_count
FROM orders
GROUP BY order_id
HAVING COUNT(*) > 1
ORDER BY dup_count DESC;

Before building ANY dashboard, this is query #1. Group by the column that should be unique (order_id here; grouping by customer_id would only list repeat customers, not duplicates). Duplicates in source data = wrong KPIs. Always check.

2️⃣ Running totals with window functions

SELECT order_date, revenue,
       SUM(revenue) OVER (ORDER BY order_date) AS running_total
FROM sales;

No subquery. No JOIN. One clean line. Stakeholders love a running total visual. Note that rows sharing the same order_date all get the same running total; add a tiebreaker column to the ORDER BY if you need it row by row.

3️⃣ Month-over-month comparison

SELECT month, revenue,
       LAG(revenue) OVER (ORDER BY month) AS prev_month,
       ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0
             / NULLIF(LAG(revenue) OVER (ORDER BY month), 0), 2) AS mom_growth
FROM monthly_sales;

This one query powers half my Power BI KPI cards. NULLIF avoids a divide-by-zero when the previous month is 0, and the first month has no previous row, so its growth is NULL.

4️⃣ Replacing NULLs without breaking aggregations

SELECT COALESCE(region, 'Unknown') AS region,
       COALESCE(revenue, 0) AS revenue
FROM sales_data;

SUM and AVG silently skip NULL values, and a NULL region shows up as its own unlabeled group. COALESCE is your safety net. Use it deliberately: turning a NULL revenue into 0 lowers AVG, because that row now counts.

5️⃣ Top N per category

SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS rn
    FROM sales
) ranked
WHERE rn <= 3;

Top 3 products per region. Top 5 customers per segment. This pattern works for everything.

These 5 queries cover most of my day-to-day analyst work. Master these before memorising anything else. 💪

Day 7 of my #30DaysOfData series. One practical insight every day. Follow along 🔔

💬 Which of these do you use most? Or is there a query you'd add to this list? Drop it below 👇

#30DaysOfData #SQL #DataAnalytics #DataAnalyst #SQLtips #DataEngineering #WindowFunctions #BIwithPankhuri
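To see how NULLs actually interact with aggregates (query 4 above), here is a minimal sketch using Python's built-in sqlite3 module. The three sample rows are hypothetical, and the table layout is only assumed to match the post's sales_data; other databases should behave the same way for AVG and SUM, but that is worth checking on your own engine.

```python
import sqlite3

# Hypothetical sample data: three rows, one of them with NULL region and revenue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("North", 100.0), ("South", 200.0), (None, None)],
)

# AVG(revenue) skips the NULL row (300 / 2 rows).
# AVG(COALESCE(revenue, 0)) counts it as a zero (300 / 3 rows).
avg_skipping_nulls, avg_nulls_as_zero, total = conn.execute(
    """
    SELECT AVG(revenue),
           AVG(COALESCE(revenue, 0)),
           SUM(revenue)
    FROM sales_data
    """
).fetchone()

print(avg_skipping_nulls, avg_nulls_as_zero, total)  # 150.0 100.0 300.0
```

SUM is unchanged either way; the average is the number that moves, so the choice depends on whether a missing revenue really means zero.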
5 Essential SQL Queries for Data Analysts
More Relevant Posts
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id (unique identifier for each order)
- order_date (date of the order)
- revenue (amount of revenue generated by the order)

Write a SQL query to calculate the total revenue for each month, along with the running total revenue for each month, in descending order of total revenue.

Query:
```sql
WITH monthly_revenue AS (
    SELECT DATE_TRUNC('month', order_date) AS month_start,
           SUM(revenue) AS total_revenue
    FROM sales_data
    GROUP BY DATE_TRUNC('month', order_date)
)
SELECT month_start,
       total_revenue,
       SUM(total_revenue) OVER (
           ORDER BY month_start
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM monthly_revenue
ORDER BY total_revenue DESC;
```

Answer: The query calculates the total revenue for each month and a running total accumulated in calendar order, then lists the months by total revenue, highest first.

Explanation:
- The CTE groups orders by month using the `DATE_TRUNC` function and sums the revenue per month.
- The outer query calculates the running total using the `SUM` window function over the months in chronological order.
- The final `ORDER BY total_revenue DESC` only changes the display order; the running total still accumulates chronologically.

Example: Consider the "sales_data" table:

| order_id | order_date | revenue |
|----------|------------|---------|
| 1 | 2022-01-05 | 100 |
| 2 | 2022-01-15 | 150 |
| 3 | 2022-02-10 | 200 |
| 4 | 2022-02-20 | 180 |

The query will output:

| month_start | total_revenue | running_total |
|-------------|---------------|---------------|
| 2022-02-01 | 380 | 630 |
| 2022-01-01 | 250 | 250 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
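To check how the final ORDER BY interacts with the running total, the challenge can be reproduced with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite has no `DATE_TRUNC`, so `strftime('%Y-%m-01', ...)` stands in for it, and the rows are the four from the example table.

```python
import sqlite3

# The four example rows from the challenge, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, order_date TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [(1, "2022-01-05", 100), (2, "2022-01-15", 150),
     (3, "2022-02-10", 200), (4, "2022-02-20", 180)],
)

# strftime('%Y-%m-01', ...) is a SQLite stand-in for DATE_TRUNC('month', ...).
rows = conn.execute(
    """
    WITH monthly_revenue AS (
        SELECT strftime('%Y-%m-01', order_date) AS month_start,
               SUM(revenue) AS total_revenue
        FROM sales_data
        GROUP BY strftime('%Y-%m-01', order_date)
    )
    SELECT month_start,
           total_revenue,
           SUM(total_revenue) OVER (ORDER BY month_start) AS running_total
    FROM monthly_revenue
    ORDER BY total_revenue DESC
    """
).fetchall()

for row in rows:
    print(row)
# ('2022-02-01', 380, 630)
# ('2022-01-01', 250, 250)
```

February is listed first because of the descending sort, yet its running total (630) still includes January, since the window is ordered by month.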
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id: The unique identifier of an order
- order_date: The date when the order was placed
- revenue: The amount of revenue generated by the order

Write a SQL query to calculate the 7-day rolling average revenue for each order, considering only the orders from the previous 7 days (including the current day).

Query:
```sql
SELECT order_id,
       order_date,
       revenue,
       AVG(revenue) OVER (
           ORDER BY order_date
           RANGE BETWEEN INTERVAL '6' DAY PRECEDING AND CURRENT ROW
       ) AS rolling_avg_revenue
FROM sales_data;
```

Answer: The SQL query calculates the 7-day rolling average revenue for each order, considering only the orders from the previous 7 days (including the current day).

Explanation:
- The query uses a window function with the AVG() function to calculate the rolling average revenue.
- The OVER clause is used to define the window frame as the range between 6 days before the current row and the current row.
- This allows us to calculate the average revenue over a moving 7-day window for each order.

Example: Consider the "sales_data" table:

| order_id | order_date | revenue |
|----------|------------|---------|
| 1 | 2022-01-01 | 100 |
| 2 | 2022-01-02 | 150 |
| 3 | 2022-01-03 | 200 |
| 4 | 2022-01-04 | 180 |
| 5 | 2022-01-05 | 220 |

The query will output:

| order_id | order_date | revenue | rolling_avg_revenue |
|----------|------------|---------|---------------------|
| 1 | 2022-01-01 | 100 | 100.00 |
| 2 | 2022-01-02 | 150 | 125.00 |
| 3 | 2022-01-03 | 200 | 150.00 |
| 4 | 2022-01-04 | 180 | 157.50 |
| 5 | 2022-01-05 | 220 | 170.00 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
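The reason the frame is a RANGE over dates rather than a count of rows shows up once the data has gaps. The sketch below uses Python's sqlite3 module with three hypothetical rows. SQLite has no `INTERVAL` literal, so it orders by `julianday(order_date)` and uses a numeric offset of 6 instead; this assumes a SQLite build of 3.28 or later, where RANGE frames with offsets are supported.

```python
import sqlite3

# Hypothetical data with an eight-day gap before the last order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, order_date TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [(1, "2022-01-01", 100.0), (2, "2022-01-02", 150.0), (3, "2022-01-10", 200.0)],
)

rows = conn.execute(
    """
    SELECT order_id,
           -- Calendar window: rows whose date is within 6 days before this one.
           AVG(revenue) OVER (
               ORDER BY julianday(order_date)
               RANGE BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS avg_last_7_days,
           -- Row-count window: the 6 previous rows, whatever their dates are.
           AVG(revenue) OVER (
               ORDER BY order_date
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS avg_last_7_rows
    FROM sales_data
    ORDER BY order_id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 100.0, 100.0)
# (2, 125.0, 125.0)
# (3, 200.0, 150.0)
```

For order 3 the calendar window contains only that order, while the row-count window still averages in the two orders from early January.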
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id: unique identifier for each order
- order_date: date when the order was placed
- product_id: unique identifier for each product
- quantity: the quantity of the product ordered
- revenue: the revenue generated from the order

Write a SQL query to calculate the running total revenue for each day, considering all previous days. The result should include all columns from the original table plus an additional column for the running total revenue.

Query:
```sql
SELECT order_id,
       order_date,
       product_id,
       quantity,
       revenue,
       SUM(revenue) OVER (ORDER BY order_date, order_id) AS running_total_revenue
FROM sales_data
ORDER BY order_date, order_id;
```

Answer: The SQL query calculates a running total of revenue that carries over from one day to the next, so each row includes the revenue of all previous days.

Explanation:
- The query uses the window function `SUM()` without a `PARTITION BY`, so the total never resets. Partitioning by order_date would restart the sum every day and ignore the previous days.
- The `ORDER BY order_date, order_id` within the window function accumulates the rows in date order, using order_id as a tiebreaker for orders placed on the same day.

Example: Consider the "sales_data" table:

| order_id | order_date | product_id | quantity | revenue |
|----------|------------|------------|----------|---------|
| 1 | 2022-01-01 | A | 2 | 100 |
| 2 | 2022-01-01 | B | 1 | 50 |
| 3 | 2022-01-02 | A | 1 | 60 |
| 4 | 2022-01-03 | C | 3 | 120 |

The output of the query will be:

| order_id | order_date | product_id | quantity | revenue | running_total_revenue |
|----------|------------|------------|----------|---------|-----------------------|
| 1 | 2022-01-01 | A | 2 | 100 | 100 |
| 2 | 2022-01-01 | B | 1 | 50 | 150 |
| 3 | 2022-01-02 | A | 1 | 60 | 210 |
| 4 | 2022-01-03 | C | 3 | 120 | 330 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
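Whether a running total "considers all previous days" comes down to the PARTITION BY clause. A small sketch with Python's sqlite3 module, using the four example rows, puts both variants side by side; the column aliases `across_all_days` and `within_each_day` are illustrative names, not part of the challenge.

```python
import sqlite3

# The four example rows from the challenge.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, order_date TEXT, "
    "product_id TEXT, quantity INTEGER, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?)",
    [(1, "2022-01-01", "A", 2, 100), (2, "2022-01-01", "B", 1, 50),
     (3, "2022-01-02", "A", 1, 60), (4, "2022-01-03", "C", 3, 120)],
)

rows = conn.execute(
    """
    SELECT order_id,
           -- No PARTITION BY: the total keeps growing across days.
           SUM(revenue) OVER (ORDER BY order_date, order_id) AS across_all_days,
           -- PARTITION BY order_date: the total restarts every day.
           SUM(revenue) OVER (PARTITION BY order_date ORDER BY order_id) AS within_each_day
    FROM sales_data
    ORDER BY order_date, order_id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 100, 100)
# (2, 150, 150)
# (3, 210, 60)
# (4, 330, 120)
```

The two columns agree on the first day and diverge from the second day onward, which is where the partitioned version drops the earlier revenue.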
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id (unique identifier for each order)
- order_date (date of the order)
- product_id (unique identifier for each product)
- quantity (the quantity of the product ordered)
- price (the price of one unit of the product)

Write a SQL query to calculate the cumulative revenue for each product over time, considering all previous orders, ordered by the product_id and order_date.

Query:
```sql
SELECT order_date,
       product_id,
       quantity,
       price,
       SUM(quantity * price) OVER (PARTITION BY product_id ORDER BY order_date) AS cumulative_revenue
FROM sales_data
ORDER BY product_id, order_date;
```

Answer: The SQL query calculates the cumulative revenue for each product over time by considering all previous orders. It uses a window function to sum the product of quantity and price for each row, partitioned by product_id and ordered by order_date.

Explanation:
- The query uses the `SUM()` window function along with the `OVER` clause to calculate the cumulative revenue.
- The `PARTITION BY product_id` ensures that the sum is reset for each product_id.
- The `ORDER BY order_date` inside the window specifies the ordering within each partition.
- The final `ORDER BY product_id, order_date` sorts the result set; without it the row order is not guaranteed.

Example: Consider the following "sales_data" table:

| order_id | order_date | product_id | quantity | price |
|----------|------------|------------|----------|-------|
| 1 | 2022-01-01 | A | 2 | 10 |
| 2 | 2022-01-03 | B | 1 | 20 |
| 3 | 2022-01-05 | A | 3 | 10 |
| 4 | 2022-01-07 | A | 1 | 10 |

The query will output:

| order_date | product_id | quantity | price | cumulative_revenue |
|------------|------------|----------|-------|--------------------|
| 2022-01-01 | A | 2 | 10 | 20 |
| 2022-01-05 | A | 3 | 10 | 50 |
| 2022-01-07 | A | 1 | 10 | 60 |
| 2022-01-03 | B | 1 | 20 | 20 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
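One detail the example data does not exercise is what happens when a product has two orders on the same date. With only `ORDER BY order_date` in the window, the default frame treats same-date rows as peers and gives them the same cumulative value. The sketch below (Python's sqlite3 module, three hypothetical rows for product A) compares that with an explicit row-by-row frame; the aliases `default_frame` and `row_by_row` are illustrative.

```python
import sqlite3

# Hypothetical rows: orders 1 and 2 fall on the same date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, order_date TEXT, "
    "product_id TEXT, quantity INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?)",
    [(1, "2022-01-01", "A", 2, 10),
     (2, "2022-01-01", "A", 3, 10),
     (3, "2022-01-02", "A", 1, 10)],
)

rows = conn.execute(
    """
    SELECT order_id,
           -- Default frame: same-date rows are peers and share one total.
           SUM(quantity * price) OVER (
               PARTITION BY product_id ORDER BY order_date
           ) AS default_frame,
           -- Tiebreaker plus ROWS frame: strictly one row at a time.
           SUM(quantity * price) OVER (
               PARTITION BY product_id ORDER BY order_date, order_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS row_by_row
    FROM sales_data
    ORDER BY order_id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 50, 20)
# (2, 50, 50)
# (3, 60, 60)
```

Both versions end at the same total; they differ only on the tied rows, so the choice depends on whether the report is per day or per order.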
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id: The unique identifier of the order
- order_date: The date when the order was placed
- product_id: The unique identifier of the product
- quantity: The quantity of the product ordered
- price: The price of each unit of the product

Write a SQL query to calculate the total sales for each month, considering both quantity and price. The result should include the month and year of the order along with the total sales amount.

Query:
```sql
SELECT DATE_FORMAT(order_date, '%Y-%m') AS month_year,
       SUM(quantity * price) AS total_sales
FROM sales_data
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY month_year;
```

Answer: The SQL query calculates the total sales for each month by multiplying the quantity of each product by its price and summing up the results. It then groups the results by the month and year of the order.

Explanation:
- We use the `DATE_FORMAT` function to extract the month and year from the order_date column in the format 'YYYY-MM'.
- The `SUM(quantity * price)` calculates the total sales amount by multiplying the quantity and price of each product.
- The `GROUP BY` clause groups the results by the month and year extracted from the order_date.
- Finally, we order the results by the month and year.

Example: Consider the following "sales_data" table:

| order_id | order_date | product_id | quantity | price |
|----------|------------|------------|----------|-------|
| 1 | 2022-01-10 | 101 | 2 | 10 |
| 2 | 2022-01-15 | 102 | 3 | 15 |
| 3 | 2022-02-05 | 103 | 1 | 20 |
| 4 | 2022-02-20 | 104 | 2 | 25 |

The result of the query would be:

| month_year | total_sales |
|------------|-------------|
| 2022-01 | 65 |
| 2022-02 | 70 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
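`DATE_FORMAT` is MySQL syntax, so the query needs a small change on other engines: PostgreSQL would use `to_char(order_date, 'YYYY-MM')` and SQLite `strftime('%Y-%m', order_date)`. As a quick check of the example numbers, here is the SQLite form run through Python's sqlite3 module on the four example rows.

```python
import sqlite3

# The four example rows from the challenge.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, order_date TEXT, "
    "product_id INTEGER, quantity INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?)",
    [(1, "2022-01-10", 101, 2, 10), (2, "2022-01-15", 102, 3, 15),
     (3, "2022-02-05", 103, 1, 20), (4, "2022-02-20", 104, 2, 25)],
)

# strftime('%Y-%m', ...) plays the role of MySQL's DATE_FORMAT(..., '%Y-%m').
rows = conn.execute(
    """
    SELECT strftime('%Y-%m', order_date) AS month_year,
           SUM(quantity * price) AS total_sales
    FROM sales_data
    GROUP BY strftime('%Y-%m', order_date)
    ORDER BY month_year
    """
).fetchall()

print(rows)  # [('2022-01', 65), ('2022-02', 70)]
```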
-
💬 SQL Challenge of the Day

Problem: You are given a table named "orders" with the following columns:
- order_id: unique identifier for each order
- customer_id: unique identifier for each customer
- order_date: date the order was placed
- total_amount: the total amount of the order
- country: the country where the order was placed

Write a SQL query to calculate the running total of the total_amount for each customer, ordered by order_date, within each country. Include the order_id in the result set.

Query:
```sql
SELECT order_id,
       customer_id,
       order_date,
       total_amount,
       country,
       SUM(total_amount) OVER (PARTITION BY customer_id, country ORDER BY order_date) AS running_total
FROM orders
ORDER BY customer_id, country, order_date;
```

Answer: The query above returns every order together with the running total of total_amount for its customer within its country, accumulated in order_date order.

Explanation: This query uses a window function with the PARTITION BY clause to calculate the running total of total_amount for each customer within each country, ordered by order_date. The final ORDER BY only sorts the result set so that each customer's orders appear together.

🛠️ Example: Consider the following "orders" table:

| order_id | customer_id | order_date | total_amount | country |
|----------|-------------|------------|--------------|---------|
| 1 | 101 | 2021-01-01 | 50 | USA |
| 2 | 102 | 2021-01-02 | 30 | Canada |
| 3 | 101 | 2021-01-03 | 70 | USA |
| 4 | 103 | 2021-01-04 | 40 | USA |

The result of the query would be:

| order_id | customer_id | order_date | total_amount | country | running_total |
|----------|-------------|------------|--------------|---------|---------------|
| 1 | 101 | 2021-01-01 | 50 | USA | 50 |
| 3 | 101 | 2021-01-03 | 70 | USA | 120 |
| 2 | 102 | 2021-01-02 | 30 | Canada | 30 |
| 4 | 103 | 2021-01-04 | 40 | USA | 40 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
-
💬 SQL Challenge of the Day

Problem: Given a table "sales" with columns (order_id, product_id, quantity, order_date), write a SQL query to calculate the cumulative sum of quantity for each product_id ordered, resetting the sum when encountering a different product_id, and order the results by product_id and order_date.

Query:
```sql
SELECT order_id,
       product_id,
       quantity,
       order_date,
       SUM(quantity) OVER (PARTITION BY product_id ORDER BY order_date) AS cumulative_sum
FROM sales
ORDER BY product_id, order_date;
```

Answer: The query calculates the cumulative sum of quantity for each product_id in the "sales" table, resetting the sum when encountering a different product_id. The results are ordered by product_id and order_date.

Explanation: The query uses a window function with the PARTITION BY clause to calculate the cumulative sum of quantity for each product_id. The ORDER BY clause within the window function ensures that the sum accumulates in order_date order, and the partition makes the cumulative sum reset when the product_id changes. The final ORDER BY sorts the result set, which the window's own ORDER BY does not guarantee.

Example: Consider the "sales" table:

| order_id | product_id | quantity | order_date |
|----------|------------|----------|------------|
| 1 | A | 10 | 2022-01-01 |
| 2 | A | 15 | 2022-01-02 |
| 3 | B | 20 | 2022-01-01 |
| 4 | A | 5 | 2022-01-03 |

The query will output:

| order_id | product_id | quantity | order_date | cumulative_sum |
|----------|------------|----------|------------|----------------|
| 1 | A | 10 | 2022-01-01 | 10 |
| 2 | A | 15 | 2022-01-02 | 25 |
| 4 | A | 5 | 2022-01-03 | 30 |
| 3 | B | 20 | 2022-01-01 | 20 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
-
Day 10 of my 30 days business analytics challenge. SQL Joins. And this one genuinely felt like a level up.

One table tells you what happened. Joins tell you the whole story.

Here is the problem with looking at just one table. You have an orders table. Great. You know order IDs, dates, amounts. But you do not know who placed those orders, where they live, or what product category they bought from, because that information lives in completely separate tables. So your analysis is always half-finished.

Joins fix that. They let you connect tables using a shared key, usually something like customer_id, and pull a complete picture into one query. Customer name. Order date. Product category. Region. Revenue. All in one result. All from one question.

The query that made it click for me today:

SELECT c.customer_name, c.region, SUM(o.sales) AS total_spent
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_name, c.region
ORDER BY total_spent DESC;

Before this query I had a list of orders with no names. After it I had a ranked list of my highest-value customers by region. That is a sales team conversation. Written in five lines.

But here is the part that nobody really explains well at first. INNER JOIN and LEFT JOIN are not just technical variations. They answer different business questions.

INNER JOIN - show me customers who have placed orders.
LEFT JOIN - show me all customers, including the ones who have never ordered.

That second one sounds small. It is not. That second query finds your inactive customers. The people who signed up, never bought, and are silently sitting in your database. That is a retention problem. That is a marketing campaign waiting to happen. That is revenue being left on the table every single month.

A LEFT JOIN that returns NULL values in the order columns is not a data gap. It is a business insight.

Ten days in and this is the pattern I keep noticing. Every SQL concept I learn is just a new way of asking a business question. The syntax is almost secondary. What matters is understanding what you are actually looking for and why it matters to the person who needs to make a decision.

Day 10 - 10 join queries done, multi-table logic added to the Project.

INNER JOIN or LEFT JOIN, which one confused you more when you first learned it?

#SQLJoins #BusinessAnalyst #BAJourney #DataAnalytics #SQL #AnalyticsSkills #CareerGrowth #DataSkills
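The inactive-customer question described above can be sketched as a LEFT JOIN that keeps only the customers with no matching order. This is an illustration under assumptions, run with Python's sqlite3 module: the customers and orders tables follow the post's query, while the order_id column and all sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: Chloe signed up but never ordered.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, region TEXT)"
)
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, sales REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Asha", "North"), (2, "Ben", "South"), (3, "Chloe", "North")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0)],
)

# LEFT JOIN keeps every customer; a NULL order_id means no order matched.
inactive = conn.execute(
    """
    SELECT c.customer_name, c.region
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.order_id IS NULL
    ORDER BY c.customer_name
    """
).fetchall()

# Same join without the filter: all customers, with 0 for those who never ordered.
totals = conn.execute(
    """
    SELECT c.customer_name, COALESCE(SUM(o.sales), 0) AS total_spent
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.customer_name
    ORDER BY total_spent DESC, c.customer_name
    """
).fetchall()

print(inactive)  # [('Chloe', 'North')]
print(totals)    # [('Asha', 200.0), ('Ben', 50.0), ('Chloe', 0)]
```

Swapping LEFT JOIN for INNER JOIN in the second query would drop Chloe from the result entirely, which is the difference between the two business questions.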
-
💬 SQL Challenge of the Day

Problem: You are given a table named "sales_data" with the following columns:
- order_id (integer)
- sale_date (date)
- amount (decimal)

Write a SQL query to calculate the cumulative sum of the sales amount for each order_id, accumulating in sale_date order from that order's first sale.

Query:
```sql
SELECT order_id,
       sale_date,
       amount,
       SUM(amount) OVER (PARTITION BY order_id ORDER BY sale_date) AS cumulative_sum
FROM sales_data
ORDER BY order_id, sale_date;
```

Answer: The SQL query calculates the cumulative sum of the sales amount for each order, starting from that order's first sale.

Explanation:
- The query uses a window function with the `SUM` function to calculate the cumulative sum.
- It partitions the data by the order_id and orders the data by the sale_date within each partition.
- This restarts the cumulative sum at the first sale of each order.

Example: Consider the following "sales_data" table:

| order_id | sale_date | amount |
|----------|------------|--------|
| 1 | 2022-01-15 | 100.00 |
| 1 | 2022-02-20 | 150.00 |
| 2 | 2022-01-10 | 200.00 |
| 2 | 2022-03-05 | 300.00 |

The query will output:

| order_id | sale_date | amount | cumulative_sum |
|----------|------------|--------|----------------|
| 1 | 2022-01-15 | 100.00 | 100.00 |
| 1 | 2022-02-20 | 150.00 | 250.00 |
| 2 | 2022-01-10 | 200.00 | 200.00 |
| 2 | 2022-03-05 | 300.00 | 500.00 |

In the result, the cumulative_sum column shows the cumulative sum of the sales amount for each order, starting from its first sale.

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data
-
💬 SQL Challenge of the Day

Problem: You have a table named "sales_data" containing information about sales transactions. Each row represents a single transaction with columns: transaction_id, product_id, sale_amount, and transaction_date. Write a SQL query to calculate the cumulative sum of sale_amount for each product_id, ordered by transaction_date, resetting the sum when encountering a new product_id.

Query:
```sql
SELECT transaction_id,
       product_id,
       sale_amount,
       SUM(sale_amount) OVER (PARTITION BY product_id ORDER BY transaction_date) AS cumulative_sum
FROM sales_data
ORDER BY product_id, transaction_date;
```

Answer: The SQL query calculates the cumulative sum of sale_amount for each product_id, resetting the sum when a new product_id is encountered, and lists each product's transactions in transaction_date order.

Explanation: The query uses a window function with the PARTITION BY clause to calculate the cumulative sum of sale_amount for each product_id. It resets the sum when a new product_id is encountered due to the PARTITION BY clause. The ORDER BY inside the window accumulates each product's sales chronologically, and the final ORDER BY sorts the result set by product and date.

Example: Assume the "sales_data" table has the following data:

| transaction_id | product_id | sale_amount | transaction_date |
|----------------|------------|-------------|------------------|
| 1 | A | 100 | 2022-01-01 |
| 2 | A | 150 | 2022-01-03 |
| 3 | B | 200 | 2022-01-02 |
| 4 | A | 120 | 2022-01-05 |

The query will output:

| transaction_id | product_id | sale_amount | cumulative_sum |
|----------------|------------|-------------|----------------|
| 1 | A | 100 | 100 |
| 2 | A | 150 | 250 |
| 4 | A | 120 | 370 |
| 3 | B | 200 | 200 |

#PowerBIChallenge #PowerInterview #LearnPowerBi #LearnSQL #TechJobs #DataAnalytics #DataScience #BigData #DataAnalyst #MachineLearning #Python #SQL #Tableau #DataVisualization #DataEngineering #ArtificialIntelligence #CloudComputing #BusinessIntelligence #Data