The Intricacies of Null Values in MySQL

RMAG news

Introduction:

For those who are new to MySQL, including myself, you might encounter unexpected results when you query from a table that has some null values. In MySQL, null is neither true nor false but unknown, and you can’t compare null with null unless you use the ‘IS NULL’ or ‘IS NOT NULL’ operator. I found that the way MySQL treats null is not only interesting but also important to understand. Here, I’m demonstrating how null behaves using various MySQL functions!

Joining Tables with Columns That Contain Null Values

NULL with INNER JOIN vs. NULL with LEFT JOIN
Let’s say there are two tables, as described below:

Table name : teachers

id
dept_id
name

101
1
Shrivell

102
1
Throd

103
1
Splint

104
NULL
Spiregrain

105
2
Cutflower

106
NULL
Deadyawn

Table name : departments

id
name

1
Computing

2
Design

3
Engineering

NULL
NULL

Notice that there are null values in the teachers table. What would happen if I use an INNER JOIN to combine these two tables?

SELECT teachers.name, departments.name AS department
FROM teachers INNER JOIN departments
ON (teachers.dept_id = departments.id)

Here is the result

+———–+————+
| Name | Department |
+———–+————+
| Shrivell | Computing |
| Throd | Computing |
| Splint | Computing |
| Cutflower | Design |
+———–+————+

Notice that in the results, you don’t see the names of two teachers whose department_ids are null.

But what about using a LEFT JOIN? Let’s take a look.
Here is the query where I simply swap ‘INNER’ for ‘LEFT’.

SELECT teachers.name, departments.name AS department
FROM teachers LEFT JOIN departments
ON (teachers.dept_id = departments.id)

Here is the result

+————+————+
| Name | Department |
+————+————+
| Shrivell | Computing |
| Throd | Computing |
| Splint | Computing |
| Spiregrain | NULL |
| Cutflower | Design |
| Deadyawn | NULL |
+————+————+

You’ll notice that all the teachers are listed. INNER JOIN does not return records with null values, but LEFT JOIN does.

WHY?

Keep in mind that when using the equality operator (=) with NULL to NULL, the result is neither true nor false—it’s unknown. Therefore, records with NULL values on both sides of the comparison are not included in the result set when using INNER JOIN. However, by the nature of LEFT JOIN, it selects ALL records from the left side table (which is the teachers table in this case), even if the dept_id is null. Unlike INNER JOIN, which would remove such records from the list, LEFT JOIN retains them.

EXISTS & NULL and IN & NULL

Both EXISTS and IN do not return records with null values.
Let’s consider the example where I have these tables.

Table name: customers

customer_id
first_name
last_name
phone_number

1
John
Doe
123-456-7890

2
Jane
Smith
987-654-3210

3
Alice
Johnson
NULL

4
Bob
Brown
555-555-5555

Table name: customers_2

customer_id
first_name
last_name
phone_number

1
John
Doe
123-456-7890

2
Jane
Smith
NULL

3
Alice
Johnson
111-222-3333

4
Bob
Brown
NULL

Here is a query:
Using EXISTS to check if a name in b_table exists in a_table.

SELECT * FROM customers AS c1
WHERE EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number);

Using IN to see if name in b_table is in a_table

SELECT *
FROM customers AS c1
WHERE (c1.first_name, c1.last_name, c1.phone_number) IN (
SELECT c2.first_name, c2.last_name, c2.phone_number
FROM customers_2 AS c2
);

Both returns the same result

+————-+————+———–+————–+
| customer_id | first_name | last_name | phone_number |
+————-+————+———–+————–+
| 1 | John | Doe | 123-456-7890 |
+————-+————+———–+————–+

Notice that records with null values are not returned.

WHY?
Because with EXISTS, you can’t link columns that are null. For IN, it ignores null values as if they don’t exist. Thus, both treat nulls in the same way: neither returns nulls.

** HOWEVER, it’s not the same story with NOT EXISTS and NOT IN when dealing with NULL!**

Let’s take a look.
I simply swapped EXISTS for NOT EXISTS.

SELECT * FROM customers AS c1
WHERE NOT EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number);

Here is the result

| customer_id | first_name | last_name | phone_number |
|————-|————|———–|—————|
| 2 | Jane | Smith | 987-654-3210 |
| 3 | Alice | Johnson | [NULL] |
| 4 | Bob | Brown | 555-555-5555 |

Notice that it is returning records with null values. Both Jane and Bob’s phone numbers in the customers_2 table are null, but since both phone numbers have values in the customers table, the returned values are not null.

How about using NOT IN?

SELECT *
FROM customers AS c1
WHERE (c1.first_name, c1.last_name, c1.phone_number) NOT IN (
SELECT c2.first_name, c2.last_name, c2.phone_number
FROM customers_2 AS c2
);

Here is the result

+————-+————+———–+————–+
| customer_id | first_name | last_name | phone_number |
+————-+————+———–+————–+

Nothing is returned!
Why does NOT EXISTS return null values but NOT IN does not? It’s because they both treat null differently. NOT EXISTS is simply the opposite of EXISTS. It returns EVERYTHING that is not returned by EXISTS, including nulls.

However, NOT IN returns records that evaluate to TRUE, but in MySQL, null is neither TRUE nor FALSE—it is unknown.

Let’s take a look at the record for Jane Smith with Id2.
When NOT IN is used, it compares:
Jane != Jane OR Smith != Smith OR 987-654-3210 != null.

Jane != Jane -> False
Smith != Smith -> False
987-654-3210 != null -> NULL
False or False or Null evaluates to NULL, so the record doesn’t get returned!

Here’s a little twist:
What if the first name in the customers table does not match the first name in the customers_2 table?

Let’s compare:
Jane Smith 987-654-3210 (from the customers table) and Jack Smith null (from the customers_2 table).

Here is the result

+————-+————+———–+————–+
| customer_id | first_name | last_name | phone_number |
+————-+————+———–+————–+
| 2 | Jack | Smith | 987-654-3210 |

What? You see Jack! Doesn’t NOT IN fail to evaluate null as either false or true, so the record shouldn’t be returned?

Let’s analyze it:

Jane != Jack -> True
Smith != Smith -> False
987-654-3210 != null -> NULL

True or False or Null actually returns:
True || False -> True
True || NULL -> True
Therefore, the record gets returned!

If you want to understand how true, false, and null are evaluated, here is the link:

https://dev.mysql.com/doc/refman/8.0/en/logical-operators.html#operator_and

As you can see, it can get quite confusing when it comes to comparing null values.

At the end of the day, I think it’s best to explicitly compare null values using IS NULL or IS NOT NULL, which returns true or false as shown below.

SELECT * FROM customers AS c1
WHERE EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number OR (c1.phone_number IS NULL AND c2.phone_number IS NULL));

Conclusion

Handling null values in MySQL can be quite challenging, requiring careful attention when they are present in your database. It’s crucial to conduct explicit null checking to clearly define how null values should be treated. By implementing the strategies I mentioned earlier, such as using IS NULL or IS NOT NULL, you can ensure more predictable and accurate query results.

Leave a Reply

Your email address will not be published. Required fields are marked *