Skip to content

Lecture 38 – Handling Missing Data with NULL

Lecture 38 Handling Missing Data with NULL

This lecture covers handling NULL values. In the real world, data is often incomplete. A customer might not have a phone number, or a post might not have any content yet. In SQL, the special value NULL is used to represent missing or unknown data. It is not the same as an empty string ('') or the number zero (0).

NULL is a placeholder for data that does not exist. Understanding how to work with it is crucial.

The Problem with `NULL`

NULL behaves in a special way with standard comparison operators. You cannot test for NULL by using operators like = or !=. A condition like WHERE MyColumn = NULL will never be true, because NULL is considered to have an unknown value, so it cannot be said to be equal to anything, not even another NULL.

The Correct Way: `IS NULL` and `IS NOT NULL`

To check for NULL values, SQL provides two special operators:

  • IS NULL: This condition is true if the column’s value is NULL.
  • IS NOT NULL: This condition is true if the column’s value is not NULL.

Let’s create a sample table to see this in action.

CREATE TABLE Products (
    ProductID INTEGER PRIMARY KEY,
    ProductName TEXT,
    Description TEXT
);

INSERT INTO Products (ProductName, Description) VALUES ('Laptop', 'A powerful computer');
INSERT INTO Products (ProductName) VALUES ('Mouse'); -- Description is NULL
INSERT INTO Products (ProductName, Description) VALUES ('Keyboard', ''); -- Description is an empty string

Finding Rows with Missing Data

To find all products that are missing a description, we must use IS NULL.

SELECT * FROM Products WHERE Description IS NULL;
Lecture 38 Handling Missing Data with NULL
Lecture 38 Handling Missing Data with NULL

This will correctly return the ‘Mouse’ product. The ‘Keyboard’ product will not be returned, because an empty string is a value, not the absence of a value.

Finding Rows with Data

To find all products that have a description, we use IS NOT NULL.

SELECT * FROM Products WHERE Description IS NOT NULL;

This will return both the ‘Laptop’ and the ‘Keyboard’, because both have a value in the `Description` column (even though one is empty).

`NULL` in Aggregate Functions

Aggregate functions like SUM()AVG()MIN(), and MAX() simply ignore NULL values in their calculations. The COUNT() function is a bit special:

  • COUNT(*): Counts all rows, regardless of any NULL values.
  • COUNT(column_name): Counts all rows where the specified column is not null.
In the Next Lecture…

We will look at data integrity constraints. We have already seen some of these, like PRIMARY KEY and NOT NULL, but we will explore others like UNIQUECHECK, and DEFAULT that help ensure the data in our database is always valid and consistent.

Lecture 37: Speeding Up Queries with Indexes

Najeeb Alam

Najeeb Alam

Technical writer specializes in developer, Blogging and Online Journalism. I have been working in this field for the last 20 years.