
When working with SQL, the IN
and NOT IN
operators are powerful tools for filtering data. While IN
behaves as expected, NOT IN
can produce surprising results when NULL values are involved. This post will help you understand and avoid this common pitfall.
Understanding the Basics
The IN operator checks if a value matches any value in a specified list:
-- Find employees in departments 1, 2, or 3
SELECT * FROM employees
WHERE department_id IN (1, 2, 3);
The NOT IN operator does the opposite – it finds values that aren’t in the list:
-- Find employees NOT in departments 1, 2, or 3
SELECT * FROM employees
WHERE department_id NOT IN (1, 2, 3);
The NULL Value Trap
Here’s where things get tricky. When NULL values are involved with NOT IN
, you might get unexpected results:
-- This query might return ZERO rows!
SELECT * FROM employees
WHERE department_id NOT IN (1, 2, 3, NULL);
Why does this happen? In SQL, comparing anything to NULL results in “unknown” (not true or false). The NOT IN
operator only returns rows where the value is definitely not in the list, and with NULL in the list, no value can be definitely “not in” the list.
Visual Example
Let’s use a simple employees
table to illustrate:
emp_id | name | dept_id | address |
---|---|---|---|
1 | Alice | 1 | California |
2 | Bob | 2 | Florida |
3 | Carol | 3 | Mountana |
4 | Dave | 4 | Texas |
5 | Eve | NULL | New York |
Here’s what different queries would return:
Query | Results | Explanation |
---|---|---|
WHERE department_id IN (1, 2) |
Alice, Bob | Employees in departments 1 or 2 |
WHERE department_id NOT IN (1, 2) |
Carol, Dave | Employees NOT in departments 1 or 2 (Eve excluded due to NULL) |
WHERE department_id NOT IN (1, 2, NULL) |
NONE | Returns nothing because of NULL in the list |
Safe Alternatives to NOT IN
Option 1 :Explicitly Handle NULL Values
SELECT * FROM employees
WHERE department_id NOT IN (1, 2, 3)
AND department_id IS NOT NULL;
By adding the IS NOT NULL
check, we ensure we’re only dealing with non-NULL values.
Option 2 :Use NOT EXISTS
SELECT * FROM employees e
WHERE NOT EXISTS (
SELECT 1 FROM departments d
WHERE d.id IN (1, 2, 3)
AND e.department_id = d.id
NOT EXISTS
performs better with NULL values because it checks whether any rows are returned by the subquery, not the values themselves.
Option 3 :Use LEFT JOIN with NULL Check
SELECT e.*
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.id AND d.id IN (1, 2, 3)
WHERE d.id IS NULL;
Key Takeaways
- The
IN
operator works as expected, even with NULL values in the list. - The
NOT IN
operator can produce unexpected empty results when NULL values are present. - Always handle NULL values explicitly when using
NOT IN
by adding anIS NOT NULL
condition. - Consider using
NOT EXISTS
orLEFT JOIN
as safer alternatives that handle NULL values correctly.
2 thoughts on “SQL IN and NOT IN: The NULL Value Trap”
Great explanation of Singular Value Decomposition and its applications in matrix factorization. The Trainer API by Hugging Face seems like a powerful tool for simplifying machine learning workflows. It’s impressive how it makes AI projects more accessible to beginners. Could you provide more details on how SVD integrates with modern machine learning frameworks? WordAiApi
Thanks and glad that you found it helpful! Please stay tuned for more on SVD integration with modern ML and more.