Why does `column = NULL` silently filter to zero rows, and does AI catch it?

The item shows a query that filters WHERE status = NULL and asks the candidate to predict the result and recognize whether an AI-suggested fix is correct. The competence under test is the candidate’s grasp of three-valued logic — SQL’s distinctive NULL handling — and whether they trust an AI assistant’s output blindly when it touches NULL semantics. This trap matters because the broken query returns zero rows silently rather than raising an error, and the bug is invisible in code review unless the reviewer knows the rule.

What this question tests

The concept being tested is the difference between equality comparison (=) and null-predicate testing (IS NULL / IS NOT NULL) under SQL’s three-valued logic. In standard SQL, comparing any value to NULL using =, <>, <, >, or any other comparison operator yields UNKNOWN, not TRUE or FALSE. A WHERE clause keeps a row only when the predicate evaluates to TRUE; rows where the predicate evaluates to UNKNOWN (or FALSE) are filtered out. The practical consequence is that WHERE status = NULL returns zero rows on every query, regardless of how many rows in the table actually have a NULL status.

AI assistants have absorbed this rule in the abstract but apply it inconsistently in generated code. They will frequently write column = NULL when the user’s natural-language prompt says “where status is missing” or “where status equals null,” treating the English phrasing as if it mapped to the SQL operator directly. The probe tests whether the candidate spots the bug and whether they trust the AI’s auto-fix when the AI proposes one.

Why this is the right answer

Consider an AI-generated filter:

-- AI-generated: silently filters to zero rows
SELECT order_id, customer_id, status
FROM orders
WHERE status = NULL;

This query parses, executes without error, and returns an empty result set. The fix is to use the null-predicate:

-- Corrected: IS NULL evaluates to TRUE for NULL, FALSE
-- otherwise — never UNKNOWN
SELECT order_id, customer_id, status
FROM orders
WHERE status IS NULL;

The right answer recognizes both that = NULL is the bug and that IS NULL is the canonical fix under ANSI SQL. A subtler case involves the negation: candidates who know IS NULL often still write WHERE status <> NULL to mean “status is anything but NULL,” which has the same UNKNOWN problem. The correct form is:

-- Excludes rows with NULL status
SELECT order_id, customer_id, status
FROM orders
WHERE status IS NOT NULL;

And to express “status is set, and is something other than ‘cancelled’” — a common real-world filter — both predicates are needed:

SELECT order_id, customer_id, status
FROM orders
WHERE status IS NOT NULL
  AND status <> 'cancelled';

A third dialect-aware case: SQL Server’s legacy SET ANSI_NULLS OFF setting makes = NULL evaluate to TRUE for NULL values, mimicking pre-standard behavior. This setting is deprecated; Microsoft documents that future versions will always behave as if ANSI_NULLS ON. AI assistants sometimes generate code that depends on ANSI_NULLS OFF semantics — code that works on one connection and silently misbehaves on another. The right answer recognizes the ANSI_NULLS complication as a pitfall, not as license to write = NULL.

What the wrong answers reveal

The three incorrect options each map to a common but mistaken mental model:

“column = NULL returns rows where the column is NULL, the same as IS NULL.” This is the most common misreading and reflects a candidate who has not internalized three-valued logic. Many programming languages outside SQL treat null == null as true (JavaScript) or as a runtime error (Java when not autoboxed); SQL uniquely treats it as UNKNOWN. Respondents picking this option are mapping application-language semantics onto SQL.
“column = NULL raises a syntax error and the query fails at parse time.” This is what some other typed languages do when comparing a value to a literal null. SQL does not — the comparison parses and runs, returning UNKNOWN for every row. The query produces zero results, not an error. Respondents picking this option expect louder failure modes than SQL provides.
“column = NULL works correctly when ANSI_NULLS is set to OFF, so it’s safe to use in SQL Server.” Even though this is technically true on legacy SQL Server configurations, it’s a bad answer because (a) the behavior is deprecated and Microsoft has stated it will eventually always behave as ANSI_NULLS ON, (b) most modern code cannot assume the setting, and (c) writing portable SQL requires using the standard IS NULL predicate. Respondents picking this option have absorbed a vendor-specific gotcha but missed the portability and forward-compatibility argument.

The wrong-answer patterns cluster into three distinct gaps: unfamiliarity with three-valued logic, expectation of louder errors, and over-reliance on legacy vendor settings.

How the sample test scores you

In the AIEH 5-question AI-Augmented SQL sample, this item contributes one of five datapoints aggregated into a single ai_augmented_sql score via the W3.2 normalize-by-count threshold. Binary scoring per item: 5 for the correct option, 1 for any of the three wrong options. With 5 binary items, the average ranges 1–5 and the level threshold maps avg ≤ 2 to low, ≤ 4 to mid, > 4 to high.

Data Notice: Sample-test results are directional indicators only. A 5-question sample can’t reliably distinguish “knows SQL NULL semantics” from “got lucky on these specific items”; for a verified Skills Passport credential, take the full 50-question assessment.

The full assessment probes NULL semantics across JOIN predicates, GROUP BY grouping behavior (NULLs as their own group), aggregate function NULL handling, and NOT IN versus NOT EXISTS interactions with NULL — the most common NULL-related production bugs. See the scoring methodology for how AI-Augmented SQL scores map onto the AIEH 300–850 Skills Passport scale.

NOT IN and NULL. A subquery that returns any NULL value causes WHERE x NOT IN (subquery) to evaluate to UNKNOWN for every row, returning zero results. The fix is NOT EXISTS, which uses two-valued logic at the row level and doesn’t propagate NULLs the same way. AI assistants frequently emit NOT IN when NOT EXISTS is the safer idiom; reviewing for this pattern is part of reading AI-generated SQL critically.
COALESCE and NULLIF as NULL-handling tools. COALESCE(x, default) returns the first non-NULL argument; NULLIF(a, b) returns NULL when a = b, else a. Both are ANSI-standard and behave consistently across dialects, in contrast to vendor-specific ISNULL / IFNULL functions that differ subtly in argument order and type coercion.
NULL in GROUP BY and DISTINCT. SQL treats NULLs as equal to each other for grouping and deduplication purposes, even though NULL = NULL evaluates to UNKNOWN. This asymmetry — NULLs are unequal under = but equal under GROUP BY — is one of the language’s stranger corners and is a frequent source of AI-generated bugs.
NULLs in foreign-key columns and outer joins. A LEFT JOIN produces NULL on the right side for unmatched outer rows, and a subsequent WHERE right.column = 'x' filters those rows out — silently converting the LEFT JOIN into an INNER JOIN. Recognizing this pattern in AI-generated SQL is part of NULL-aware code review.

For the broader AI-Augmented SQL lineup, see the tests catalog. For how SQL fluency intersects with broader assessment design, see ai fluency in hiring and the backend engineering interview prep guide.

Sources

ISO/IEC 9075-2:2023. Information technology — Database languages — SQL — Part 2: Foundation (SQL/Foundation). International Organization for Standardization. — Three-valued logic specified in clause 6.34 (boolean value expression) and null predicate in clause 8.8.
Microsoft. (2024). SQL Server Documentation: SET ANSI_NULLS (Transact-SQL). https://learn.microsoft.com/en-us/sql/t-sql/statements/set-ansi-nulls-transact-sql
PostgreSQL Global Development Group. (2024). PostgreSQL Documentation: Comparison Operators. https://www.postgresql.org/docs/current/functions-comparison.html
Date, C. J. (2008). SQL and Relational Theory: How to Write Accurate SQL Code (3rd ed.). O’Reilly. — Chapter on NULL handling and three-valued logic remains the canonical reference for the semantic issues exercised by this item.

What this question tests

Why this is the right answer

What the wrong answers reveal

How the sample test scores you

Related concepts

Sources

Try the question yourself