Normalization in DBMS

Normalization in DBMS

What is Normalization?

The process by which a bigger dataset is further broken down into well-structured smaller multiple tables which reduces inconsistency, that does not cause any undesired result during any operation like insert, update, and delete is executed.

This ensures that there is no unnecessary duplication of data and the tables with proper relation between them.

Example

Let's say we have a group of superhero cards in the same box.

It is difficult to find only an IronMan card, we have to search the whole box and every card gets it.

If we separate them by which superhero belongs to MARVEL or DC and put them in the different boxes named MARVEL superheroes and DC superheroes. Now it makes it one step easier to find an IronMan card since we can search only in the MARVEL box instead of searching for whole cards.

Then we can break down each box further by sorting the cards by their nature like God or Human or alien.

Why Normalization?

Normalization improves efficiency and accuracy in data operation because the smaller datasets have a proper relation between them and no unnecessary duplicates. It reduces memory wastage and loss of data can be reduced to a large extent.

Terminologies

  • Tuple: A single row of data in a table or relation.
  • Atomic value: A value that cannot be further divided or broken down into smaller parts.
  • Prime attribute: A attribute that is used to uniquely identify a tuple in the relation.
  • Nonprime attribute: A attribute does not contribute to the uniqueness of a tuple in the relation.
  • Super key: A set of one or more attributes that can uniquely identify a tuple in a relation.
  • Candidate Key: A minimal super key, meaning that it is a super key with no nonprime attributes.
  • Partial dependency: A nonprime attribute depends on only a subset of the candidate key attributes, rather than on the entire set.
  • Transitively dependency: A nonprime attribute depends on another nonprime attribute, which in turn depends on part of a candidate key.

Types of Normalization

  1. First Normal Form(1NF)
  2. Second Normal Form(2NF)
  3. Third Normal Form(3NF)
  4. Boyce Codd's Normal Form(BCNF)

First Normal Form

For a table to justify 1NF it needs to satisfy:

  • Each column should contain atomic values.
  • Each Value stored in a column should be of the same type.
  • All the columns in a table should have unique names.

Example

The below table has several columns that store information about employees, such as their IDs, names, address, and mobile number. And this table is not normalized since there are columns with multiple values.

No alt text provided for this image
Employee details

After the first normalization, we should ensure that "each column has atomic value". The Emp_Moblie column has multiple values of phone numbers. Now we have to break it into its atomic values as the below table.

No alt text provided for this image
Employee details after 1NF

Now this table ensures data atomicity.

Second Normal Form

For a table to justify 2NF it needs to satisfy: 

  • It should be in its 1NF. 
  • It should not have any partial dependencies.

Example

The below table stores the list of teacher's detail in a school. And this table is in 1NF form as each column does not have multiple values. But it does not follow 2NF because "A table should not have any partial dependencies". There are teachers in the same age group.

No alt text provided for this image
Teacher's details

The Teachers_Age alone does not decide either the Subject or the Teacher_id and Teachers_Age along with any other column does not decide the other column. So the Teacher_Age is a "nonprime" column.

No alt text provided for this image
Teacher's Age
No alt text provided for this image
Teacher's subject

We break down the table into two different tables as Teacher_Id with Teacher_Age and Teacher_Id with Subject, to eliminate partial dependency.

Third Normal Form (3 NF)

For a table to be in the third normal form: it should satisfy:

  • It should already be in the 2NF.
  • It should not have Transitive Dependency.

Example

The below table stores the list of employee's detail of a company.

No alt text provided for this image
Employee_Details

Here, Emp_State, Emp_City & Emp_District dependent on Emp_Zip. and Emp_zip is dependent on Emp_Id which makes non-prime attributes (Emp_State, Emp_City & Emp_District) transitively dependent on the superKey(Emp_Id). This violates the rule of 3NF. So will break down the table to

No alt text provided for this image
Employee Table
No alt text provided for this image
Employee_Address table

Here, Emp_State, Emp_City & Emp_District just dependent on Emp_Zip.

Now these tables follow 3NF.

Boyce-Codd Normal Form (BCNF)

For a table to be in the Boyce-Codd Normal Form, it should satisfy: 

  • It should be in the Third Normal Form.  
  • Every determinant should be a candidate key.

Example

The below table stores the list of employee's detail of a company.

No alt text provided for this image
Employee table

In this table, the column Dept_Type and Dept_No_Of_Emp depends on Emp_Dept and Emp_Nationality depends on Emp_Id. And there are two candidate keys Emp_Id and Emp_Dept. We need to separate the table based on the candidate keys.

No alt text provided for this image
Employee Nationality table
No alt text provided for this image
Employee Department table
No alt text provided for this image
Relation table of both the Employee Nationality table and the Employee Department table

This table contains the "candidate key "of both the previous table linked into one table which serves as a relation.

BCNF eliminates all potential for redundancy and data anomalies resulting from functional dependencies.

To view or add a comment, sign in

More articles by Vignesh Muniyappan

  • Real-Time Web Communication Simplified with WebSockets and STOMP

    In today's web applications, real-time communication is a must. Users expect instant updates and interactive features.

  • Spring Boot - P5

    Title: Building User Registration and Login System using Spring Boot Introduction: In this tutorial, we'll guide you…

  • Angular - P1

    What is Angular? Angular is a popular open-source web application framework developed by Google. It's used for building…

  • Spring boot P-4

    An Introduction to Spring Boot MVC In the realm of web development, Spring Boot has emerged as a powerful framework…

  • SpringBoot - P3

    Exploring Annotations in Springboot: Introduction: Annotations play a pivotal role in modern Java programming…

  • SpringBoots - P2

    What and Why Spring Initializer? Spring Initializer is a web tool for quickly creating Spring Boot projects. It offers…

  • SpringBoots - P1

    What Is SpringBoots? A popular framework to build Java applications. Built on Spring Framework.

  • Manual Testing vs Automation Testing

    What is Software Testing? Process of evaluating and verifying that a software product or application does what it is…

    3 Comments
  • Introduction to Servlet and JSP Development

    Web development forms the backbone of the modern digital world. From simple websites to complex web applications…

  • OBJECT ORIENTED PROGRAMING IN JAVA

    Introduction: Java is an object oriented programming language which is designed to program using classes and objects…

Others also viewed

Explore content categories