This lecture introduces database indexes.
Imagine trying to find a specific topic in a book that has no index at the back. You would have to scan every single page from the beginning until you found it. This is what a database has to do by default when you search for data in a column that is not a primary key. This is called a full table scan, and it can be very slow on large tables.
A database index is a special data structure that works just like the index at the back of a book. It holds a copy of the values from a specific column (or columns) in a sorted order, along with a pointer to the location of the full row. This allows the database to find matching rows extremely quickly without having to scan the entire table.
How Indexes Work
When you create an index on a column, the database creates and maintains this separate, sorted list. When you run a query with a WHERE
clause on that indexed column, the database can quickly search the sorted index (which is very fast) to find the exact location of the rows it needs, and then jump directly to them.
Creating an Index
The syntax to create an index is simple.
CREATE INDEX index_name ON table_name (column_name);
Let’s go back to our `Posts` table from the blog project. We might frequently search for posts by their title. To speed this up, we can create an index on the `Title` column.
CREATE INDEX idx_posts_title ON Posts (Title);
It’s a common convention to name indexes with a prefix like `idx_` followed by the table and column name.
After creating this index, a query like SELECT * FROM Posts WHERE Title = 'Fun with Joins';
will be much faster on a large table.
The Trade-off of Indexes
If indexes are so great, why not create them on every single column? There is a trade-off:
- Faster Reads, Slower Writes: Indexes make
SELECT
queries much faster, but they make data modification (INSERT
,UPDATE
,DELETE
) slower. This is because every time you change the data, the database must also update the index. - Storage Space: An index takes up extra space on your disk.
Therefore, you should only create indexes on columns that you frequently use in WHERE
clauses for searching or in JOIN
conditions.
Note: Primary key columns are automatically indexed by the database, which is why it’s always fast to search by ID.
In the Next Lecture…
We will discuss how to handle missing or unknown data in our database using the NULL
value and the special operators IS NULL
and IS NOT NULL
.
Lecture 36: Third Normal Form 3NF
