Skip to content

Lecture 34: Database Normalization and 1NF

  • by
  • 3 min read
Setting Up Our Environment

This lecture introduces the concept of database normalization and First Normal Form (1NF).

Database Normalization and 1NF

When we designed our blog database, we intuitively decided to split the data into three tables: `Authors`, `Posts`, and `Comments`. We did this to avoid repeating data. For example, if an author writes 100 posts, we don’t want to store their name and email 100 times.

This process of organizing tables and columns to reduce data redundancy and improve data integrity has a formal name: Normalization.

Normalization is a set of rules or guidelines called Normal Forms. The most common are First (1NF), Second (2NF), and Third (3NF) Normal Form. Let’s start with the first one.

First Normal Form (1NF)

A table is in First Normal Form if it meets two conditions:

  1. It has a primary key. This ensures every row is unique.
  2. Each cell contains a single, atomic value. This means you cannot have a column that contains multiple values, like a list or an array.
An Unnormalized Example

Imagine we tried to store a list of tags for each blog post directly in the `Posts` table, like this:

PostIDTitleTags
1Learning SQL‘SQL, Databases, Tech’
2Fun with Joins‘SQL, Joins’

This table violates 1NF because the `Tags` column contains multiple values in a single cell. This is bad for several reasons:

  • How do you search for all posts tagged with ‘SQL’? You would have to search inside a text string, which is slow and inefficient.
  • How do you count how many posts have a specific tag? It’s very difficult.
  • It’s hard to update or remove a single tag without affecting the others.
Making it 1NF

To fix this and bring the design into 1NF, we need to break the information into separate tables. We would create a `Tags` table and a `Post_Tags` table to link them.

Posts Table

PostIDTitle
1Learning SQL
2Fun with Joins

Tags Table

TagIDTagName
101SQL
102Databases
103Tech
104Joins

Post_Tags Table (Junction Table)

PostIDTagID
1101
1102
1103
2101
2104
Setting Up Our Environment

Now, every cell in every table holds a single value. Our design is in 1NF. This structure, using a third “junction” table to manage a many-to-many relationship, is a fundamental pattern in database design.

In the Next Lecture…

We will continue our discussion of normalization by looking at Second Normal Form (2NF), which deals with how the non-key columns in a table relate to a composite primary key.

Lecture 33: Advanced Subqueries

Najeeb Alam

Najeeb Alam

Technical writer specializes in developer, Blogging and Online Journalism. I have been working in this field for the last 20 years.