Simplify SQL Queries with CTEs

Been refactoring some messy SQL queries at work lately. Found a pattern that made everything cleaner. The general usage of sql: ------- SELECT department_id, AVG(salary) FROM ( SELECT * FROM employees WHERE hire_date > '2020-01-01' ) recent_hires GROUP BY department_id HAVING AVG(salary) > ( SELECT AVG(salary) FROM employees ); ------- It works. But reading it is hard sometimes. We can use CTEs instead: -------- WITH recent_hires AS ( SELECT * FROM employees WHERE hire_date > '2020-01-01' ), company_avg AS ( SELECT AVG(salary) AS avg_salary FROM employees ) SELECT department_id, AVG(salary) FROM recent_hires GROUP BY department_id HAVING AVG(salary) > (SELECT avg_salary FROM company_avg); -------- Same result. But now each piece has a name. We can read top to bottom. Why use CTEs: · Break big queries into small named chunks · Can reuse the same CTE multiple times · Makes code reviews easier (people actually understand what you wrote) · Recursive ones are great for org charts or nested categories ----- When to skip CTEs: · Really simple queries (don't over-engineer) · Huge intermediate results (temp table performs better) · Need indexes on the intermediate data ----- Bottom line: If your SQL has nested subqueries more than one level deep, try a CTE. Makes life easier. #SQL #CTE #Database #DataEngineering #PostgreSQL #MySQL

  • No alternative text description for this image

To view or add a comment, sign in

Explore content categories