This lecture introduces the concept of database normalization and First Normal Form (1NF).
Database Normalization and 1NF
When we designed our blog database, we intuitively decided to split the data into three tables: `Authors`, `Posts`, and `Comments`. We did this to avoid repeating data. For example, if an author writes 100 posts, we don’t want to store their name and email 100 times.
This process of organizing tables and columns to reduce data redundancy and improve data integrity has a formal name: Normalization.
Normalization is a set of rules or guidelines called Normal Forms. The most common are First (1NF), Second (2NF), and Third (3NF) Normal Form. Let’s start with the first one.
First Normal Form (1NF)
A table is in First Normal Form if it meets two conditions:
- It has a primary key. This ensures every row is unique.
- Each cell contains a single, atomic value. This means you cannot have a column that contains multiple values, like a list or an array.
An Unnormalized Example
Imagine we tried to store a list of tags for each blog post directly in the `Posts` table, like this:
PostID | Title | Tags |
---|---|---|
1 | Learning SQL | ‘SQL, Databases, Tech’ |
2 | Fun with Joins | ‘SQL, Joins’ |
This table violates 1NF because the `Tags` column contains multiple values in a single cell. This is bad for several reasons:
- How do you search for all posts tagged with ‘SQL’? You would have to search inside a text string, which is slow and inefficient.
- How do you count how many posts have a specific tag? It’s very difficult.
- It’s hard to update or remove a single tag without affecting the others.
Making it 1NF
To fix this and bring the design into 1NF, we need to break the information into separate tables. We would create a `Tags` table and a `Post_Tags` table to link them.
Posts Table
PostID | Title |
---|---|
1 | Learning SQL |
2 | Fun with Joins |
Tags Table
TagID | TagName |
---|---|
101 | SQL |
102 | Databases |
103 | Tech |
104 | Joins |
Post_Tags Table (Junction Table)
PostID | TagID |
---|---|
1 | 101 |
1 | 102 |
1 | 103 |
2 | 101 |
2 | 104 |

Now, every cell in every table holds a single value. Our design is in 1NF. This structure, using a third “junction” table to manage a many-to-many relationship, is a fundamental pattern in database design.
In the Next Lecture…
We will continue our discussion of normalization by looking at Second Normal Form (2NF), which deals with how the non-key columns in a table relate to a composite primary key.