SQL Fundamentals Series (PostgreSQL Edition) — Part 5

When working with real datasets, duplicate values are common. For example, a customer table may contain many records from the same country. If you query the column normally, you'll see repeated values. In SQL, the DISTINCT keyword is used to return unique values only.

Example:

SELECT DISTINCT first_name FROM customer;

This query retrieves a list of first names from the customer table without duplicates. Instead of returning every row, the database returns each unique first_name once.

DISTINCT becomes especially useful when you want to understand the different categories within a dataset, such as:
• unique countries
• unique product categories
• unique customer segments

These types of queries are frequently used when exploring data in systems like PostgreSQL.

#SQL #PostgreSQL #DataEngineering #DataAnalytics
SQL Fundamentals Series (PostgreSQL Edition) — Part 8

When working with grouped data, filtering becomes slightly different. Earlier, we used the WHERE clause to filter rows. But once you introduce GROUP BY, filtering must happen after aggregation. This is where the HAVING clause comes in. In SQL, HAVING is used to filter grouped results.

Example:

SELECT name AS category_name
FROM category
GROUP BY name
HAVING COUNT(*) < 5;

This query:
• groups the rows of the category table by name
• counts the rows in each group
• returns only the names that appear fewer than 5 times

Key difference:
WHERE filters rows before grouping.
HAVING filters groups after aggregation.

This distinction is critical when analyzing data in systems like PostgreSQL. Understanding when to use WHERE vs HAVING is what allows you to write accurate analytical queries.

#SQL #PostgreSQL #DataEngineering #DataAnalytics
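A minimal sketch of the WHERE vs HAVING distinction, assuming a Pagila-style film table with rating and rental_rate columns (table and column names are illustrative, not from the post):

```sql
-- WHERE filters individual rows BEFORE grouping:
-- only films cheaper than 3.00 are considered at all.
SELECT rating, COUNT(*) AS film_count
FROM film
WHERE rental_rate < 3.00
GROUP BY rating;

-- HAVING filters the groups AFTER aggregation:
-- all films are grouped first, then small groups are dropped.
SELECT rating, COUNT(*) AS film_count
FROM film
GROUP BY rating
HAVING COUNT(*) >= 50;
```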
SQL Fundamentals Series (PostgreSQL Edition) — Part 7

Retrieving data is useful. But real insights come from summarizing data. In SQL, this is done using the GROUP BY clause. GROUP BY allows you to group rows that share the same value and apply aggregate functions like COUNT().

Example:

SELECT district, COUNT(*)
FROM address
GROUP BY district;

This query groups the rows of the address table by district and returns the number of addresses in each district. Instead of looking at individual rows, you now get a summary view of the data.

This is how analysts answer questions like:
• How many addresses are in each district?
• Which category appears most frequently?
• How is data distributed across groups?

GROUP BY is one of the most important tools when analyzing datasets in systems such as PostgreSQL.

#SQL #PostgreSQL #DataEngineering #DataAnalytics
SQL Pocket Guide – Alice Zhao
Day 13/30 | Pages 137–147

Today's lesson focused on SQL data types—an essential building block for creating structured and reliable databases. In SQL, each column can only include values of a single data type. Choosing the right type ensures data integrity and helps queries run efficiently.

Example:

CREATE TABLE my_table (
  id INT,
  name VARCHAR(30),
  dt DATE
);

INT – stores integer values
VARCHAR(30) – stores text up to 30 characters
DATE – stores date values

Data types go beyond just INT, VARCHAR, and DATE. There are four main categories, each with multiple subtypes, and syntax can vary depending on your database system (MySQL, SQL Server, PostgreSQL, etc.).

Key takeaway: Picking the right data type is not just a technical detail—it's the foundation of accurate, efficient, and maintainable databases.

Day 13 with the 33-person SQL study group reminded me that even small decisions, like data type selection, have a big impact on how we can analyze and trust our data.

How do you decide which data type to use when designing a new table?

#StudyWithTele #SQLChallenge #DataAnalytics #LearningInPublic #30DaysOfConsistency
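As a sketch of types beyond INT, VARCHAR, and DATE, here is a hypothetical PostgreSQL table touching the main categories (numeric, character, date/time, boolean); the table and column names are illustrative:

```sql
CREATE TABLE orders (
    order_id   SERIAL PRIMARY KEY,   -- auto-incrementing integer
    amount     NUMERIC(10, 2),       -- exact decimal, good for money
    status     VARCHAR(20),          -- variable-length text, capped
    notes      TEXT,                 -- unbounded text
    is_paid    BOOLEAN,              -- true/false
    placed_at  TIMESTAMP             -- date and time together
);
```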
Just wrapped up a track on Window Functions in PostgreSQL and I'll be honest, this is where SQL started feeling a lot more powerful.

I moved past basic queries into things like ranking data with ROW_NUMBER(), RANK(), and DENSE_RANK(), and comparing rows using LEAD() and LAG(). Getting comfortable with PARTITION BY and the OVER() clause really changed how I think about analyzing data without losing detail.

Also spent time working with ROLLUP and CUBE, which made building summary reports across multiple levels way easier than I expected.

Big takeaway for me: you don't always need to group and lose your data just to analyze it. Window functions let you keep everything and still get deep insights.

Looking forward to applying this more in my projects and everyday use of PostgreSQL.

#SQL #PostgreSQL #DataAnalytics
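A small sketch of the ideas above, assuming a hypothetical sales table with employee, region, and amount columns:

```sql
-- Window functions keep every row while adding analytic columns.
SELECT
    employee,
    region,
    amount,
    -- rank employees within each region by amount, highest first
    RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS region_rank,
    -- compare each row to the previous amount in the same region
    LAG(amount) OVER (PARTITION BY region ORDER BY amount) AS prev_amount
FROM sales;

-- ROLLUP, by contrast, does aggregate, but adds subtotals per
-- region plus a grand total row on top of the per-employee rows.
SELECT region, employee, SUM(amount)
FROM sales
GROUP BY ROLLUP (region, employee);
```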
Day 11 of my 30-day challenge with Alice Zhao's "SQL Pocket Guide", covering pages 116–125 in Chapter 5. Today's reading in the book was all about modifying tables. Here are my top three takeaways from today's reading:

1. Renaming a Table
I learned that even after a table is built, you can give it or its columns a total makeover. Most systems (like MySQL, Oracle, and Postgres) use ALTER TABLE ... RENAME TO, while SQL Server has its own special command, EXEC sp_rename. It's a great reminder that clear naming conventions are a journey, not just a destination.

2. Adding and Deleting Columns
Data needs grow, and the book shows how to use the ADD and DROP COLUMN commands to expand or trim tables. One interesting nuance: in SQLite, deleting a column or modifying a constraint isn't a direct command; you actually have to manually create a new table, copy the data over, and delete the old one. It's a bit more work, but it gets the job done.

3. The Most Important "WHERE" in Your Career
This was the biggest lesson of the day: the UPDATE warning. When you use the UPDATE ... SET command to update rows of data, if you forget to include a WHERE clause, the book warns that the entire table will be updated. Always run a SELECT statement with your WHERE criteria first to "preview" exactly which rows you're about to change before you hit the point of no return.

At this point, I like to visualize a database as a living, breathing thing that I can reshape as my data questions evolve.

#StudyWithTele #SQLChallenge #30DaysOfConsistency #SQLPocketGuide #DataEngineering #DatabaseDesign
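The three takeaways above might look like this in PostgreSQL syntax (table and column names are hypothetical):

```sql
-- 1. Renaming a table and one of its columns
ALTER TABLE clients RENAME TO customers;
ALTER TABLE customers RENAME COLUMN fname TO first_name;

-- 2. Adding and deleting columns
ALTER TABLE customers ADD COLUMN loyalty_tier VARCHAR(10);
ALTER TABLE customers DROP COLUMN legacy_code;

-- 3. Preview before you UPDATE: check which rows match first...
SELECT * FROM customers WHERE city = 'Lagos';
-- ...then run the update with the SAME WHERE clause.
UPDATE customers SET loyalty_tier = 'gold' WHERE city = 'Lagos';
```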
Built a complete Grocery Delivery DB in PostgreSQL — 6 tables, real data, 25 SQL queries from basic to advanced. Sharing for anyone learning SQL!

Honestly? I don't know everything yet. Some queries I wrote myself. Some I struggled with. Some I took help to understand. But that's exactly where I am right now — learning, practicing, and being consistent.

This document has all 25 queries sorted Easy → Hard, with the schema and everything clean.

#SQL #PostgreSQL #DataAnalytics #LearningInPublic #SQLPractice #DataAnalyst
Hello Everyone,

When I started learning data analytics, I realized something important—before analyzing data, you need a place to store and manage it properly. So in this part of my journey, I set up my PostgreSQL environment and created my first database from scratch.

Here's what I learned step by step:
✅ Installing PostgreSQL and understanding its components
✅ Creating and connecting to my first database
✅ Designing tables to store structured data
✅ Inserting data and running basic SQL queries

It may seem simple, but this is the foundation of everything in data. No database → No data → No analysis.

💬 What was your first experience with databases or SQL?

#PostgreSQL #SQL #DataAnalytics #LearningJourney #Database #Upskilling #DataEngineering
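The steps above can be sketched roughly like this (database name, table, and values are all illustrative):

```sql
-- Create a database, then connect to it (in psql: \c my_first_db)
CREATE DATABASE my_first_db;

-- Design a table to store structured data
CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    name        VARCHAR(50),
    country     VARCHAR(30)
);

-- Insert data and run a basic query
INSERT INTO customers (name, country)
VALUES ('Ada', 'Nigeria'), ('Li', 'China');

SELECT * FROM customers WHERE country = 'Nigeria';
```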
Today I learned something interesting about B-Trees in SQL Server. Indexes help organize data, and SQL Server stores them in a B-Tree structure so it can quickly navigate to the right rows without scanning all the pages.

A B-Tree mainly consists of three levels:
• Root Node – The starting point of the search
• Intermediate Nodes – Index pages that store key values and pointers to other pages
• Leaf Level – The final level, where SQL Server reaches the data pages

One interesting thing I learned is that index pages do not store the actual data. Instead, they store key values and pointers that guide SQL Server to the correct page. Think of it like road signboards on a highway 🚗 They don't contain the destination itself, but they tell you where to go next.

Because of this structure, SQL Server can locate the required data in just a few steps instead of scanning the entire table, which is crucial when working with millions of rows.

📌 I've attached a small diagram to visualize how the B-Tree structure works internally.

In my next posts, I'll explore how this B-Tree structure is used differently in clustered and non-clustered indexes.

Curious to hear from database engineers here: What analogy do you use to explain B-Trees to beginners?

#SQL #SQLServer #Database #AIEngineering #DataScience
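One way to see the B-Tree at work in SQL Server is to create an index and compare the execution plan before and after; the table and column names here are hypothetical:

```sql
-- Without an index on last_name, the query below forces a scan
-- of every page. Creating a nonclustered index builds a B-Tree:
CREATE NONCLUSTERED INDEX ix_customers_last_name
ON dbo.customers (last_name);

-- Now SQL Server can walk root -> intermediate -> leaf pages,
-- and the plan typically shows an Index Seek instead of a Scan.
SELECT customer_id, last_name
FROM dbo.customers
WHERE last_name = 'Okafor';
```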
Day 10 of my 30-day challenge with Alice Zhao's "SQL Pocket Guide", covering pages 106–115 in Chapter 5. Here are my top three takeaways from today's deep dive into populating and linking tables:

1. The Golden Rules of Keys
I learned that while constraints keep data clean, Primary and Foreign Keys are what keep it organized. A Primary Key must uniquely identify every single row and, ideally, should be immutable. Meanwhile, a Foreign Key is the bridge that links a row in one table to a Primary Key in another, creating the relationships that make relational databases so powerful.

2. Letting the Database Do the Work (Auto-IDs)
Manually assigning a unique ID to every new row is a recipe for a headache. The book showed me how to use automatically generated fields, such as AUTO_INCREMENT in MySQL, SERIAL in PostgreSQL, or IDENTITY in SQL Server, so the system handles numbering for me. It's a simple, set-it-and-forget-it step during table creation that prevents duplicate ID errors later.

3. Bulk Loading Like a Pro
Inserting data one row at a time is fine for practice, but real-world data lives in files. I practiced two power moves for moving data at scale:
--> INSERT INTO ... SELECT: This allows me to take the results of a query and dump them directly into a new table, perfect for creating summary tables.
--> The CSV import: Whether using LOAD DATA or BULK INSERT, I learned how to tell the database exactly how to read a text file, from identifying delimiters (like commas) to handling headers.

I didn't realise that different databases interpret missing data in a CSV differently. For example, a missing value might be NULL in PostgreSQL but an empty string in MySQL. Knowing these nuances is the difference between clean data and a debugging nightmare!

#SQL #DataEngineering #30DayChallenge #SQLPocketGuide #LearningJourney #DatabaseDesign
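A sketch of the three takeaways in PostgreSQL syntax (all table names and the file path are hypothetical):

```sql
-- 1 & 2. Auto-generated Primary Key, plus a Foreign Key link
CREATE TABLE authors (
    author_id SERIAL PRIMARY KEY,
    name      VARCHAR(50)
);
CREATE TABLE books (
    book_id   SERIAL PRIMARY KEY,
    title     VARCHAR(100),
    author_id INT REFERENCES authors (author_id)
);

-- 3a. INSERT INTO ... SELECT: fill a summary table from a query
-- (assumes an author_counts table already exists)
INSERT INTO author_counts (author_id, book_count)
SELECT author_id, COUNT(*) FROM books GROUP BY author_id;

-- 3b. CSV import: PostgreSQL uses COPY, where MySQL uses
-- LOAD DATA and SQL Server uses BULK INSERT
COPY books (title, author_id)
FROM '/path/to/books.csv' WITH (FORMAT csv, HEADER true);
```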
**How Data is Stored in a SQL Database (Basic → Advanced)**

Most developers use SQL daily… but very few understand how data is actually stored under the hood. Let's break it down.

1. Basic (what we see)
Data is stored in tables: rows → records, columns → fields. Simple, right? Just like Excel.

2. Intermediate (what actually happens)
Inside the database, data is stored in pages (8 KB each), and pages are grouped into extents (64 KB). Structure: Database → Files → Pages → Rows.

3. Storage types
Heap → unordered, faster inserts
Clustered index → data stored in sorted order

4. Indexes (why queries are fast)
Indexes use a B-Tree structure, which helps SQL find data quickly and converts full scans into efficient seeks.

5. Advanced (what DBAs care about)
Buffer cache → data is first read from memory, not disk
Transaction log → every change is logged before saving (for recovery)
Locking & concurrency → multiple users can safely access data
MVCC (PostgreSQL) → multiple versions of rows to avoid blocking

6. What happens when you run a query?
SELECT → Optimizer → Execution Plan → Index/Page Access → Result

Key takeaway: If you want to optimize SQL performance, don't just write queries… think in terms of pages, indexes, and memory vs disk.

Have you ever debugged a slow query and found the real issue was storage or indexing?

#SQL #Database #SQLServer #PostgreSQL #DataEngineering #PerformanceTuning #TechLearning
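In PostgreSQL, one way to peek at step 6 (optimizer → execution plan → page access) is EXPLAIN; the table name here is illustrative:

```sql
-- Show the plan the optimizer chose, with actual timings and
-- buffer (page) access, memory hits vs disk reads included.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE order_id = 42;

-- With an index on order_id the plan shows an Index Scan (a few
-- B-Tree page reads); without one, a Seq Scan over every page.
```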