WHERE and HAVING, NOT IN and NOT EXISTS do similar things in SQL, but when should you use which? WHERE and HAVING: The main difference WHERE → works on individual rows. It is applied before GROUP BY — it filters the ‘raw’ data. HAVING → works with the results of aggregate functions (COUNT, SUM, AVG, …), in simpler terms, with groups. It is applied after GROUP BY — filtering already aggregated groups. When to use: WHERE → whenever possible (faster, reduces the volume of data in advance) HAVING → only when filtering by aggregates or by GROUP BY results is required Together, when WHERE comes before GROUP BY and HAVING comes after. NOT IN and NOT EXISTS: The main difference NOT IN → comparison with a list. Checks that the value is not in the list. But there is a critical nuance: if there is at least one NULL in the subquery, the result may be empty. The reason is the peculiarities of SQL’s three-valued logic (TRUE / FALSE / UNKNOWN). May be less efficient on large datasets. NOT EXISTS → checking for the existence of rows Checks that no row exists that satisfies the condition. Not affected by NULL values. Often optimises better (especially with indexes). When to use: Use NOT EXISTS by default if there is a subquery, NULL values are possible, and reliability is important. You can use NOT IN if you are 100% certain there are no NULL values and the list of values is simple. #Java #Backend #Developer #JavaDevelompent #Software #Programming #SQL #Relational #Database #DB #PostgreSQL #Oracle #MySQL #MariaDB
Nicely done especially on the NULL thing with NOT IN. Most people don't even bring that up.
Great breakdown. especially calling out the NULL behavior in NOT IN, that’s something many overlook. 👍 Clear, practical, and directly applicable in real-world queries.
If it is possible to learn to use NOT EXISTS instead of NOT IN, it should also be possible to learn to add the condition 'WHERE ReturnColumn IS NOT NULL' when NOT IN is used. In SQL Server, both constructs are changed to INNER JOIN by the compiler.