What is Normalization in DBMS?

A database stores large amounts of data across multiple tables, and the same fact may end up stored in more than one place. Normalization in DBMS is the process of organizing tables so that this redundancy is minimized and data integrity is preserved. Normalization also helps prevent insert, update and delete anomalies.

How Does Normalization work in DBMS?

Normalization is a technique for designing a database schema: an existing schema is decomposed into smaller, well-structured tables so that data redundancy and undesirable dependencies between columns are reduced. With normalization, unwanted duplication of data is removed along with the anomalies it causes. An insert anomaly occurs when a fact cannot be recorded without also supplying unrelated data, for example when a new department cannot be stored until at least one employee is assigned to it.

An update anomaly occurs when the same value is stored in several rows and an update changes only some of them, leaving the data inconsistent. A delete anomaly occurs when deleting a record unintentionally removes other facts that were stored only in that record. The aim of normalization is therefore to remove redundant data and to keep only related data together in each table. This usually reduces the space taken up by duplicated values and gives the data a logical structure.
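The update and delete anomalies can be illustrated with a small sketch. The rows and the helper name `names_for_roll` below are hypothetical and chosen only for illustration; the point is that a table which repeats a student's name on every row can drift into an inconsistent state.

```python
# Hypothetical denormalized rows: the student's name is repeated per subject.
rows = [
    {"roll": 19, "name": "Rajesh", "subject": "Math"},
    {"roll": 19, "name": "Rajesh", "subject": "Science"},
    {"roll": 23, "name": "Supriya", "subject": "History"},
]


def names_for_roll(rows, roll):
    """Return the set of distinct names stored for one roll number."""
    return {r["name"] for r in rows if r["roll"] == roll}


# Update anomaly: a careless update touches only one of the duplicated rows,
# so roll 19 now has two different names.
rows[0]["name"] = "Rajesh K."
inconsistent = names_for_roll(rows, 19)

# Delete anomaly: removing the only subject row for roll 23 also removes
# the only record of that student's name.
remaining = [r for r in rows if not (r["roll"] == 23 and r["subject"] == "History")]
supriya_known = any(r["roll"] == 23 for r in remaining)

print(inconsistent)
print(supriya_known)
```

Storing each name once, in a table keyed by roll number, would make both problems impossible by construction.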

Types of Normalization in DBMS

The most commonly used normal forms in DBMS are listed below:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

1. First Normal Form

A table or relation is in First Normal Form if it contains no multi-valued or composite attributes. Every attribute must hold a single atomic value in each row.

Let us take the example of the STUDENT table as below:

Roll	Name	Subject
19	Rajesh	Math, Science
23	Supriya	History, English
32	Zack	Geography

The above table is not in First Normal Form because the Subject column holds multiple values in a single row. The table below has been transformed into First Normal Form, as every column contains only atomic values.

Roll	Name	Subject
19	Rajesh	Math
19	Rajesh	Science
23	Supriya	History
23	Supriya	English
32	Zack	Geography
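The transformation above can be sketched in a few lines. This is a minimal illustration, assuming the multi-valued Subject column is stored as a comma-separated string; the function name `to_first_normal_form` is made up for this example.

```python
def to_first_normal_form(rows):
    """Split a comma-separated Subject value into one row per subject."""
    atomic = []
    for roll, name, subjects in rows:
        for subject in subjects.split(","):
            atomic.append((roll, name, subject.strip()))
    return atomic


# The STUDENT table before normalization (Roll, Name, Subject).
student = [
    (19, "Rajesh", "Math, Science"),
    (23, "Supriya", "History, English"),
    (32, "Zack", "Geography"),
]

student_1nf = to_first_normal_form(student)
for row in student_1nf:
    print(row)
```

Each output row carries exactly one subject, at the cost of repeating Roll and Name, which is the redundancy that Second Normal Form addresses.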

2. Second Normal Form

To be in Second Normal Form, a relation or table must be in First Normal Form and must not contain any partial dependency. That is, no non-prime attribute may depend on a proper subset of any candidate key.

Consider again the STUDENT table from the previous section:

Roll	Name	Subject
19	Rajesh	Math
19	Rajesh	Science
23	Supriya	History
23	Supriya	English
32	Zack	Geography

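The candidate key of this table is {Roll, Subject}, but Name depends on Roll alone, which is a proper subset of that key. This is a partial dependency, so the table is split into two tables as below to make it comply with Second Normal Form.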

STUDENT

Roll	Name
19	Rajesh
23	Supriya
32	Zack

SUBJECT_DETAIL

Roll	Subject
19	Math
19	Science
23	History
23	English
32	Geography

The partial dependency Roll -> Name is removed from the original table. In ‘STUDENT’, Name is fully dependent on the primary key ‘Roll’, and in ‘SUBJECT_DETAIL’ the primary key is {Roll, Subject}, with no non-prime attributes left.
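The decomposition can be checked against the sample rows with a short sketch. The helpers `holds` and `project` are hypothetical names for this example. Note that a dependency holding in sample data only suggests, and does not prove, that it holds for the schema in general.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it after.
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping order."""
    out = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in out:
            out.append(item)
    return out


student_1nf = [
    {"Roll": 19, "Name": "Rajesh", "Subject": "Math"},
    {"Roll": 19, "Name": "Rajesh", "Subject": "Science"},
    {"Roll": 23, "Name": "Supriya", "Subject": "History"},
    {"Roll": 23, "Name": "Supriya", "Subject": "English"},
    {"Roll": 32, "Name": "Zack", "Subject": "Geography"},
]

# Name depends on Roll alone, a proper subset of the key {Roll, Subject}.
partial = holds(student_1nf, ["Roll"], ["Name"])
# Subject does not depend on Roll alone, so Roll is not a key by itself.
roll_is_key = holds(student_1nf, ["Roll"], ["Subject"])

student = project(student_1nf, ["Roll", "Name"])
subject_detail = project(student_1nf, ["Roll", "Subject"])
print(partial, roll_is_key, len(student), len(subject_detail))
```

After the split, each name is stored once per roll number, while the Roll and Subject pairs keep all five rows.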

3. Third Normal Form

A table is in Third Normal Form if it is in Second Normal Form and no non-prime attribute is transitively dependent on a candidate key. Equivalently, a relation is in Third Normal Form if, for every non-trivial functional dependency A->B, at least one of the two conditions below is true.

A is a super key.
B is a prime attribute, that is, each element of B is part of some candidate key.

Let us consider the table ‘EMPLOYEE’ as below:

EMP_ID	EMP_NAME	EMP_DEPT	EMP_STATE	EMP_COUNTRY
289	Mike	Sales	Florida	U.S.
378	Sameer	Finance	Maharashtra	India
989	Nicki	Marketing	Texas	U.S.

The candidate key in the above table is EMP_ID and the functional dependency set is EMP_ID->EMP_NAME, EMP_ID->EMP_DEPT, EMP_ID->EMP_STATE, EMP_STATE->EMP_COUNTRY. EMP_COUNTRY is therefore transitively dependent on EMP_ID through EMP_STATE. In EMP_STATE->EMP_COUNTRY, EMP_STATE is not a super key and EMP_COUNTRY is not a prime attribute, so the table violates Third Normal Form. We need to split it into two tables as below to transform it into Third Normal Form.

EMPLOYEE:

EMP_ID	EMP_NAME	EMP_DEPT	EMP_STATE
289	Mike	Sales	Florida
378	Sameer	Finance	Maharashtra
989	Nicki	Marketing	Texas

STATE_COUNTRY:

EMP_STATE	EMP_COUNTRY
Florida	U.S.
Maharashtra	India
Texas	U.S.

EMP_STATE becomes the primary key of the STATE_COUNTRY table, and the transitive dependency is removed.
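As a rough sketch, the two tables can be created in an in-memory SQLite database through Python's standard `sqlite3` module, and a join recovers the original five-column rows. The column types and the foreign key reference are assumptions added for illustration and are not part of the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STATE_COUNTRY (
        EMP_STATE   TEXT PRIMARY KEY,
        EMP_COUNTRY TEXT NOT NULL
    );
    -- The REFERENCES clause documents the link between the tables;
    -- SQLite enforces it only when PRAGMA foreign_keys is ON.
    CREATE TABLE EMPLOYEE (
        EMP_ID    INTEGER PRIMARY KEY,
        EMP_NAME  TEXT NOT NULL,
        EMP_DEPT  TEXT NOT NULL,
        EMP_STATE TEXT NOT NULL REFERENCES STATE_COUNTRY(EMP_STATE)
    );
""")
conn.executemany(
    "INSERT INTO STATE_COUNTRY VALUES (?, ?)",
    [("Florida", "U.S."), ("Maharashtra", "India"), ("Texas", "U.S.")],
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (289, "Mike", "Sales", "Florida"),
        (378, "Sameer", "Finance", "Maharashtra"),
        (989, "Nicki", "Marketing", "Texas"),
    ],
)

# Joining on EMP_STATE reconstructs the original rows without loss.
joined = conn.execute("""
    SELECT e.EMP_ID, e.EMP_NAME, e.EMP_DEPT, e.EMP_STATE, s.EMP_COUNTRY
    FROM EMPLOYEE AS e
    JOIN STATE_COUNTRY AS s ON s.EMP_STATE = e.EMP_STATE
    ORDER BY e.EMP_ID
""").fetchall()
for row in joined:
    print(row)
```

Because each state's country is stored in exactly one row, correcting it requires a single update regardless of how many employees live in that state.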

4. Boyce-Codd Normal Form

For a table to be in Boyce-Codd Normal Form, it must be in Third Normal Form and, for every non-trivial functional dependency A->B, A must be a super key of the table.

EMP_DEPT table:

ID	COUNTRY	DEPARTMENT	DEPT_TYPE	DEPT_NO
9890	India	Marketing	M098	045
11090	US	Finance	F0567	023
12390	India	Sales	S1002	012

The functional dependencies for the above table are ID -> COUNTRY and DEPARTMENT -> {DEPT_TYPE, DEPT_NO}, and {ID, DEPARTMENT} is the candidate key. Neither ID nor DEPARTMENT alone is a super key, so the table is not in BCNF. To transform it to BCNF, we have to split it into three tables as below:

EMP_COUNTRY:

ID	COUNTRY
9890	India
11090	US
12390	India

DEPT_DETAILS:

DEPARTMENT	DEPT_TYPE	DEPT_NO
Marketing	M098	045
Finance	F0567	023
Sales	S1002	012

EMP_DEPARTMENT_MAP:

ID	DEPARTMENT
9890	Marketing
11090	Finance
12390	Sales

The functional dependencies for the above tables are ID -> COUNTRY and DEPARTMENT -> {DEPT_TYPE, DEPT_NO}. The candidate keys for the tables EMP_COUNTRY, DEPT_DETAILS and EMP_DEPARTMENT_MAP are ID, DEPARTMENT and {ID, DEPARTMENT} respectively, so every determinant is now a super key of its table.
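The BCNF condition can be tested mechanically by computing attribute closures. The sketch below is a simplified check with hypothetical function names: it only considers the listed dependencies whose attributes all fall inside the given table, which is sufficient for this example but is not a complete projection of dependencies in general.

```python
def closure(attrs, fds):
    """Compute the attribute closure of attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List non-trivial dependencies whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        applies = set(lhs) <= relation and set(rhs) <= relation
        trivial = set(rhs) <= set(lhs)
        if applies and not trivial and not relation <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


fds = [
    (("ID",), ("COUNTRY",)),
    (("DEPARTMENT",), ("DEPT_TYPE", "DEPT_NO")),
]
emp_dept = {"ID", "COUNTRY", "DEPARTMENT", "DEPT_TYPE", "DEPT_NO"}
decomposed = {
    "EMP_COUNTRY": {"ID", "COUNTRY"},
    "DEPT_DETAILS": {"DEPARTMENT", "DEPT_TYPE", "DEPT_NO"},
    "EMP_DEPARTMENT_MAP": {"ID", "DEPARTMENT"},
}

# Both dependencies violate BCNF in the original table.
print(bcnf_violations(emp_dept, fds))
# None of the three decomposed tables has a violation.
for name, attrs in decomposed.items():
    print(name, bcnf_violations(attrs, fds))
```

The closure of {ID, DEPARTMENT} covers every attribute of EMP_DEPT, which confirms it as a key of the original table, while the closures of ID alone and DEPARTMENT alone do not.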

Advantages

Below are the advantages of Normalization:

Redundant data is removed.
Improved data quality and flexibility in database design.
Improved overall organization of data in the database.
Data is consistent and logically stored in the database.

Conclusion

Normalization plays a vital role in database design. It ensures data integrity and reduces redundant data. Alongside these advantages, normalization has certain drawbacks that should be kept in mind. A fully normalized schema spreads data across many tables, which can make complex business logic harder to follow and can increase the time needed to develop and implement it. The designer should therefore understand normalization well in order to use it effectively.
