This tutorial introduces the concept and principles of database normalization and takes you through the most common normal forms.

Problem Description


The store contains a "customer_order" table that has the complete details of all transactions made by the customers in the store like transaction order, product and customer information. The table has duplicate entries for customer and product information.

Disorganized and redundant data leads to performance degradation as well as difficulties in administration of the database. It also leads to anomalies during insert, update, delete operations and hence data corruption.

Operation


In order to eliminate data redundancy and assist in database administration and maintainence, the "customer_order" table is subjected to normalization.

Solution


Normalization is a process in which an initial DB design is transformed, or decomposed, into a different, but equivalent, design. The resulting schema is equivalent to the original one in the sense that no information is lost when going from one to the other. There are a few goals for deploying a normalized design :

  1. Eliminate redundant data, for example, storing the same data in more than one table
  2. Enforce valid data dependencies by only storing related data in a table, and dividing relational data into multiple related tables
  3. Maximize the flexibility of the system for future growth in data structures

It is possible to transform an original schema into a resulting one that satisfies certain conditions, known as Normal Forms :

  1. First Normal Form (1NF) - No repeating elements or groups of elements
  2. Second Normal Form (2NF) - No partial dependency on a composite key.
    (A key uniquely determines a record. A composite key is one consisting of several fields or columns)
  3. Third Normal Form (2NF) - No dependencies on non-key attributes
    (In other words, no non-key field functionally depends on any other non-key field. Such indirect dependencies are known as transitive dependencies)

There are three more normal forms namely the Boyce-Codd Normal form (BCNF), Fourth Normal form (4NF) and Fifth Normal form (5NF). These are rarely used, hence not discussed in this tutorial.