Master SQL for Data Science and Business Analyst Interviews
Introduction
What do Meta, Google, Uber, Revolut, Amazon all have in common? They all have SQL rounds in Data Science and Business Analyst interviews. Mastering SQL is a crucial step on your journey. In this comprehensive guide, you will discover the SQL skills you need to succeed in these competitive fields.
Know the Most Common Functions
Filtering Your Data
WHERE
Filters rows based on a given condition.
SELECT *
FROM users
WHERE age > 25;
Use-case: Select records of users who are older than 25.
AND & OR
Used to combine multiple conditions.
# Use both the conditions together
SELECT *
FROM users
WHERE age > 25
AND gender = 'male';
# At least one of the condition has to be true
SELECT *
FROM users
WHERE age > 25
OR age < 18;
Use-case: Filter users based on multiple conditions like age and gender.
NOT
Negates a condition.
SELECT * FROM users WHERE NOT age > 25;
Use-case: Find users who are 25 years old or younger.
Ranged Data Filtering
BETWEEN
Filters rows within a range.
SELECT * FROM users WHERE age BETWEEN 18 AND 25;
Use-case: Find users whose ages fall between 18 and 25 (inclusive).
ANY & ALL
Used with subqueries to compare values.
SELECT *
FROM users
WHERE age = ANY
(SELECT age
FROM users
WHERE gender = 'female');
SELECT *
FROM users
WHERE age = ALL
(SELECT age
FROM users
WHERE gender = 'female');
Use-case: Filter data based on conditions met in a subquery.
EXISTS
Checks for the existence of rows in a subquery.
SELECT *
FROM users
WHERE EXISTS
(SELECT *
FROM orders
WHERE users.id = orders.user_id);
Use-case: Find users who have placed at least one order.
Grouping and Summarizing Your Data
GROUP BY
Groups rows that have the same values in specified columns.
SELECT
country,
COUNT(*)
FROM users
GROUP BY country;
Use-case: Get a count of users from each country.
Aggregation Functions
AVG, COUNT, MAX, MIN, SUM
SELECT
country,
AVG(age),
COUNT(id),
MAX(height),
MIN(weight),
SUM(purchase_amount)
FROM transactions
GROUP BY country
HAVING COUNT(id) >= 100;
Use-case: Generate summary statistics for transactions grouped by country.
HAVING
Filters groups based on a condition.
SELECT
country,
COUNT(*)
FROM users
GROUP BY country
HAVING COUNT(*) > 10;
Use-case: Retrieve countries that have more than 10 users.
String Operations
Recommended by LinkedIn
SUBSTRING
Extracts part of a string.
SELECT
SUBSTRING('Hello, world!', 7, 5)
AS greeting;
Use-case: Extract the word 'world' from the text 'Hello, world!'.
LOWER & UPPER
Converts a string to lowercase or uppercase.
SELECT
LOWER('HELLO, WORLD!') AS greeting;
SELECT UPPER('hello, world!') AS greeting;
Use-case: Standardizing text data for case-insensitive comparisons.
More Advanced SQL Functions
Set Operations
1. Union / Union All
Combines the result sets of two or more queries into a single result set, eliminating duplicate rows. Union All includes duplicates.
SELECT name
FROM employees
WHERE department = 'HR'
UNION
SELECT name
FROM contractors
WHERE department = 'HR';
Use-case: Retrieve all unique names (Union) or all names including duplicates (Union All) from both the employees and contractors tables where the department is 'HR'.
Data Coalescing
2. Coalesce
Returns the first non-NULL value in a list of expressions.
SELECT COALESCE(NULL, '', 'Hello', 'World');
Use-case: Return the first non-NULL and non-empty string. In this example, it will return 'Hello'.
Date and Time Manipulation
3. CAST Function
SELECT CAST('2023-09-03' AS DATE);
Use-case: Convert a string to a date type.Datetime and Date Format
Conditional Logic
4. Case When
Provides if-then-else logic to SQL.
SELECT name,
CASE WHEN age < 20 THEN 'Teenager'
WHEN age < 30 THEN 'Young Adult'
ELSE 'Adult'
END AS age_group
FROM users;
Use-case: Categorize users into different age groups based on their age.
Advanced Querying Techniques
5. CTE and Subquery
Common Table Expressions (CTE) and Subqueries break down complex queries into simpler parts.
WITH age_data AS
( SELECT age,
COUNT(*)
FROM users
GROUP BY age )
SELECT age
FROM age_data
WHERE count > 1;
Use-case: Both queries return ages that appear more than once in the users table. The first query uses a CTE for better readability and the second one uses a subquery.
Eliminating Duplicates
6. Distinct
Removes duplicate rows from the result set.
SELECT DISTINCT country FROM users;
Use-case: Retrieve all unique countries from the users table.
What Interviewers Look For
Front-Load Your SQL Prep
Many candidates who succeeded in interviews at companies like Meta, Google, and Uber put a significant emphasis on SQL from the beginning of their preparation. Aim to solve a minimum of 100 SQL problems to feel confident.
Final Thoughts
Mastering these SQL functions and knowing when to use them can significantly impact your problem-solving ability in data science and business analyst interviews. So, practice diligently to enhance your SQL skills.
Looking for a place to practice SQL? Check out this SQL practice pad 👉 InterviewQuery.