SQL Joins Explained with Examples

🚀 SQL Joins – Deep Dive (Hands-on Learning) Spent some time strengthening my understanding of SQL Joins using real datasets. Sharing both concepts + queries 👇 🔹 INNER JOIN – Only matching records SELECT * FROM samples.bakehouse.sales_customers c INNER JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 LEFT JOIN / LEFT OUTER JOIN – All from left + matches SELECT * FROM samples.bakehouse.sales_customers c LEFT JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 LEFT ANTI JOIN – Records in left NOT in right SELECT * FROM samples.bakehouse.sales_customers c LEFT ANTI JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 LEFT SEMI JOIN – Records in left that HAVE matches SELECT * FROM samples.bakehouse.sales_customers c LEFT SEMI JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 RIGHT JOIN / RIGHT OUTER JOIN – All from right + matches SELECT * FROM samples.bakehouse.sales_customers c RIGHT JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 SEMI JOIN (engine-specific) SELECT * FROM samples.bakehouse.sales_customers c SEMI JOIN samples.bakehouse.sales_transactions t ON t.customerID = c.customerID; 🔹 CROSS JOIN – Cartesian product SELECT * FROM samples.bakehouse.sales_customers c CROSS JOIN samples.bakehouse.sales_transactions t; 💡 Key Learnings: Choosing the right join = better performance + correct results LEFT ANTI & SEMI joins are powerful for filtering datasets (especially in PySpark/Databricks) Joins are core to ETL pipelines & analytics workflows #SQL #DataEngineering #DataAnalytics #PySpark #Databricks #Learning #100DaysOfCode

To view or add a comment, sign in

Explore content categories