NULL Handling in SQL — The 5 Mistakes That Defeat Most Analysts

Published May 2026 · 12 min read · Drill 10+ NULL challenges →

NULL is the most-misunderstood part of SQL — and the most expensive part to misunderstand. Every analyst has shipped a "wrong number" report at some point because of NULL semantics: a count that quietly excluded rows, a NOT IN that silently dropped everything, a divide-by-zero hidden behind COALESCE in the wrong place. This guide walks the five mistakes that defeat most analysts, with runnable examples and the rules that actually protect you.

Contents

  1. What NULL actually means
  2. Mistake 1: col = NULL never matches
  3. Mistake 2: NOT IN with a nullable subquery
  4. Mistake 3: COUNT(*) vs COUNT(col)
  5. Mistake 4: NULL contagion in arithmetic
  6. Mistake 5: NULL in LEFT JOIN's WHERE
  7. COALESCE — the everyday workhorse
  8. NULLIF — converting sentinels
  9. Three-valued logic in one paragraph
  10. The NULL checklist
  11. Practice with real challenges

What NULL actually means

NULL is not zero, not empty string, not false. NULL means "unknown." A row's email column being NULL says "we don't have that person's email" — not "their email is empty." This sounds pedantic but it's the foundation of every NULL bug: the engine treats NULL as a third logical value, distinct from true and false, called unknown.

Three rules follow from that:

Mistake 1: col = NULL never matches

This is the canonical NULL bug. Every SQL beginner writes it once, gets confused by an empty result, then learns the lesson the hard way.

⚠ Returns nothing, even if rows have NULL email
SELECT name FROM users WHERE email = NULL;

Why: email = NULL evaluates to unknown for every row, including ones where email is genuinely NULL. Unknown is not true, so WHERE excludes them. The fix:

✅ Use IS NULL
SELECT name FROM users WHERE email IS NULL;
SELECT name FROM users WHERE email IS NOT NULL;

The same trap applies to !=, <, >, BETWEEN, IN, and pattern-match operators like LIKE. They all propagate NULL, all silently exclude.

Spotting the bug in code review. Anywhere you see = NULL or != NULL, it's a bug. Always. Even if the column has no NULLs today, it might tomorrow — and the query will start dropping rows without changing.

Mistake 2: NOT IN with a nullable subquery

This bug is more subtle than = NULL because the query looks right. It only fails when the subquery happens to contain a NULL.

⚠ Returns nothing if any order has customer_id = NULL
-- "Find customers who have never ordered"
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);

Why: NOT IN (...) logically expands to != a AND != b AND != c AND .... If any value in that list is NULL, that comparison becomes unknown. Unknown ANDed with anything stays unknown. The whole WHERE clause becomes unknown for every customer, and every row drops.

The bug is invisible until production data has its first NULL. Two safe alternatives:

✅ Pattern A: NOT EXISTS
SELECT name FROM customers c
WHERE NOT EXISTS (
  SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
✅ Pattern B: explicitly filter NULL out of the subquery
SELECT name FROM customers
WHERE id NOT IN (
  SELECT customer_id FROM orders WHERE customer_id IS NOT NULL
);

NOT EXISTS is generally safer and faster — most query optimizers can short-circuit NOT EXISTS as soon as they find any matching row, while NOT IN often forces a full materialization of the subquery. Default to NOT EXISTS unless you have a specific reason for NOT IN.

Mistake 3: COUNT(*) vs COUNT(col)

These look similar, behave differently, and the difference matters for correctness.

Three different counts
SELECT
  COUNT(*) AS total_rows,           -- every row, including NULL emails
  COUNT(email) AS with_email,       -- non-NULL emails only
  COUNT(DISTINCT email) AS unique_emails
FROM users;

Three different numbers are possible. If you have 100 users, 80 with emails (one duplicate), the result is 100, 80, 79. Pick the wrong COUNT variant and you ship a wrong report.

Same principle for other aggregates:

Mistake 4: NULL contagion in arithmetic

Arithmetic operations involving NULL produce NULL. 5 + NULL is NULL. price * quantity with quantity NULL is NULL. String concatenation: 'hello ' || NULL is NULL in standard SQL (some engines treat it as 'hello' instead — Oracle, for example).

⚠ Customers without a discount get NULL net_price, not full price
SELECT name, price - discount AS net_price
FROM orders;

If discount is sometimes NULL (no discount applied), price - discount returns NULL — not the original price. Fix with COALESCE:

✅ Treat missing discount as zero
SELECT name, price - COALESCE(discount, 0) AS net_price
FROM orders;

The same fix applies to string concatenation, division, multiplication, modulo — anywhere a NULL operand might silently turn the whole expression to NULL.

Mistake 5: NULL in LEFT JOIN's WHERE

This is the same NULL trap, hidden inside a JOIN. We covered it in detail in the JOINs guide, but it deserves a slot here because the underlying cause is purely NULL semantics.

⚠ Customers without a paid order vanish
SELECT c.name, o.amount
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.status = 'paid';

For customers with no matching order, the LEFT JOIN produces a row with o.status = NULL. Then NULL = 'paid' evaluates to unknown, and WHERE drops the row. The LEFT JOIN's whole point — keeping unmatched left rows — is silently defeated.

✅ Filter inside ON, not WHERE
SELECT c.name, o.amount
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id AND o.status = 'paid';

COALESCE — the everyday workhorse

COALESCE returns the first non-NULL argument. It's the everyday tool for providing defaults, filling gaps, and stopping NULL from contaminating downstream math.

Common COALESCE patterns
-- Default name when nickname is missing
COALESCE(nickname, first_name, 'Unknown')

-- Treat missing discount as zero
COALESCE(discount, 0)

-- Fill NULLs from a LEFT JOIN with sensible defaults
COALESCE(o.amount, 0) AS revenue,
COALESCE(o.created_at, date('now')) AS last_seen

COALESCE accepts any number of arguments. It returns NULL only if all of them are NULL. It's also smart about types — most engines promote arguments to a common type so COALESCE(int_col, 0.0) returns a real number cleanly.

COALESCE vs ISNULL. Both exist in SQL Server, only COALESCE is standard. Use COALESCE — it's portable, accepts more arguments, and handles types better. ISNULL is two-argument-only and Microsoft-specific.

NULLIF — converting sentinels

NULLIF is COALESCE's mirror. It returns NULL if the two arguments are equal, else returns the first. The classic use is avoiding divide-by-zero by turning 0 into NULL — division by NULL returns NULL, not an error.

Safe division
-- Errors when total_orders is 0:
SELECT revenue / total_orders AS avg_order_value FROM ...

-- Returns NULL when total_orders is 0:
SELECT revenue / NULLIF(total_orders, 0) AS avg_order_value FROM ...

The reason this works: most engines raise a runtime error on division by zero, but division by NULL is well-defined as NULL. The pattern x / NULLIF(y, 0) turns potential errors into NULLs that downstream code can handle (often combined with COALESCE: COALESCE(x / NULLIF(y, 0), 0) for "ratio, defaulting to zero when undefined").

Other NULLIF uses:

Three-valued logic in one paragraph

Standard programming languages have two-valued logic: true and false. SQL has three-valued logic: true, false, and unknown. NULL comparisons return unknown. The WHERE clause keeps a row only if its predicate is true; unknown rows are dropped. NOT (unknown) is unknown. true OR unknown is true (because true ORed with anything is true). false AND unknown is false (because false ANDed with anything is false). Everything else involving unknown is unknown. That's the entire system. Nearly every NULL bug traces back to one of these rules.

The NOT (unknown) trap. If you have a NOT condition wrapping a comparison that might be NULL, the result is still unknown — not true. WHERE NOT (status = 'closed') excludes rows where status is NULL, even though "NULL is not closed" feels like it should be true. Always combine with explicit IS NULL: WHERE status != 'closed' OR status IS NULL.

The NULL checklist

Before shipping any query that touches a nullable column, walk this checklist:

  1. Have I used = or != against NULL? If yes, replace with IS NULL / IS NOT NULL.
  2. Have I used NOT IN with a subquery? If yes, switch to NOT EXISTS or filter NULL out of the subquery.
  3. Am I doing arithmetic on a column that might be NULL? If yes, wrap with COALESCE or be sure the NULL output is what you want.
  4. Am I dividing by a column that might be zero or NULL? If yes, use NULLIF on the divisor.
  5. Am I filtering on a right-side column in a LEFT JOIN's WHERE? If yes, move it into ON.
  6. Does my COUNT need every row or just non-NULL ones? COUNT(*) for every row, COUNT(col) for non-NULL.
  7. Does my AVG need to treat NULL as zero? If yes, use AVG(COALESCE(col, 0)) instead.

Practice with real challenges

NULL bugs are the hardest to catch in code review because the queries don't error — they silently return wrong data. The only real cure is encountering them on practice problems where the grader actually checks NULL handling.

Catch NULL bugs before production does

Every challenge runs against real data with NULL values mixed in. The diagnostic engine flags NULL handling specifically — not just "incorrect" but "you're missing the NULL rows."

Practice NULL Handling — Free →