SQL CTEs Explained — WITH Clause, Recursive CTEs, Subqueries (2026)
CTEs (Common Table Expressions) are the difference between a SQL query you can read and a tangle of nested subqueries you can't. Once they click, you'll write almost every multi-step query as a chain of CTEs and rarely look back. This guide covers the basic WITH clause, when to choose CTE over subquery, multi-step chains, and recursive CTEs for hierarchies and sequence generation.
Contents
- What a CTE actually is
- Basic CTE syntax
- Multiple CTEs in one query
- CTEs vs subqueries — when to choose which
- CTEs vs temporary tables
- Recursive CTEs — the structure
- Walking an org chart
- Generating numbers and dates
- Are CTEs materialized? Performance notes
- Three real-world patterns
- CTE anti-patterns
- Practice with real challenges
What a CTE actually is
A CTE is a named, temporary result set you define at the top of your query and reference in the main SELECT. Syntactically it's the WITH clause; conceptually it's "let me name this intermediate step so the query reads top-to-bottom."
CTEs solve a real readability problem. Without them, a multi-step query becomes nested subqueries — outer queries reference inner queries, you read inside-out, and three levels of nesting is the practical limit before the query becomes unmaintainable. With CTEs, every step gets a name and the query reads as a sequence of operations.
Basic CTE syntax
The simplest possible CTE:
WITH first_class AS (
SELECT id, name, age
FROM passengers
WHERE class = '1st'
)
SELECT name, age FROM first_class
ORDER BY age DESC;
Read it as: "Define first_class as the result of selecting all 1st-class passengers, then select name and age from that result, sorted by age descending." Logically equivalent to a subquery, but easier to read because each step is named.
Three things to note:
WITHgoes before the mainSELECT. The CTE is defined first, then the main query references it.- The CTE has its own column names. You can rename them at definition time:
WITH first_class (passenger_id, full_name, current_age) AS (...)— useful when the inner SELECT uses internal names you want to abstract away. - CTEs are scoped to a single statement. They exist for the duration of the query and disappear. Different statements need their own definitions.
Multiple CTEs in one query
Real CTEs come in chains — one query, several named steps, separated by commas.
WITH spend_2026 AS (
SELECT customer_id, SUM(amount) AS total_spend
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY customer_id
),
high_spend AS (
SELECT * FROM spend_2026 WHERE total_spend > 10000
),
first_order AS (
SELECT customer_id, MIN(created_at) AS first_order_at
FROM orders
GROUP BY customer_id
)
SELECT c.name, h.total_spend, f.first_order_at
FROM high_spend h
JOIN first_order f USING (customer_id)
JOIN customers c ON c.id = h.customer_id
ORDER BY h.total_spend DESC;
Three CTEs, then a join in the main query. Each CTE is one logical step — compute spend in 2026, filter to high-spend, find first order date — and any CTE can reference any earlier CTE. high_spend uses spend_2026; the main query uses both high_spend and first_order.
The same query without CTEs would be three nested subqueries with two joins, hard to read and almost impossible to debug step-by-step. With CTEs, you can run the query up to a single CTE (most editors let you select-then-run a fragment) to see what each step produces.
CTEs vs subqueries — when to choose which
Almost every multi-step subquery can be rewritten as a CTE and vice versa. The choice is mostly about readability.
| Situation | Choose |
|---|---|
| Single inline filter | Subquery (WHERE id IN (SELECT ...)) |
| Two or more steps | CTE |
| Reusing the same intermediate | CTE |
| Recursion needed | CTE (only option) |
| Very simple query, brevity matters | Subquery |
| Filter on a window function | CTE (subquery works but reads worse) |
The mental shortcut: if you'd want to name what the subquery represents, that's a CTE. If a name would feel like overhead, leave it as a subquery.
Bad subquery vs clean CTE
SELECT name, total_spend, first_order_at FROM (
SELECT c.name, h.total_spend, f.first_order_at
FROM (
SELECT customer_id, SUM(amount) AS total_spend
FROM orders WHERE created_at >= '2026-01-01'
GROUP BY customer_id HAVING SUM(amount) > 10000
) h
JOIN (
SELECT customer_id, MIN(created_at) AS first_order_at
FROM orders GROUP BY customer_id
) f USING (customer_id)
JOIN customers c ON c.id = h.customer_id
) ORDER BY total_spend DESC;
Same query, written above with CTEs, takes the same number of operations. The CTE version reads as a recipe; the nested-subquery version reads as a maze.
CTEs vs temporary tables
For a single query, CTEs win — they're scoped to the statement, no cleanup, usually inlined by the optimizer. Temporary tables win in two scenarios:
- You're computing an expensive intermediate that several separate queries will use. Compute once, query many times. CTEs are re-evaluated per query; a temp table is computed once and stored.
- You're in an ETL script with stateful steps. Each step writes to a temp table, the next step reads from it. The persistence between statements is the point.
For analytics and one-off queries, default to CTEs. Reach for temp tables when you have a clear reason.
Recursive CTEs — the structure
Recursive CTEs walk hierarchies and generate sequences. The structure is always:
- Anchor — the starting row(s).
- Recursive step — joins back to the CTE itself to add more rows.
- Termination — the recursion stops when the recursive step returns no new rows.
Syntactically: WITH RECURSIVE cte_name AS (anchor UNION ALL recursive_step).
WITH RECURSIVE nums(n) AS (
SELECT 1 -- anchor: starting value
UNION ALL
SELECT n + 1 FROM nums WHERE n < 10 -- recursive step + termination
)
SELECT n FROM nums;
How the engine evaluates this:
- Run the anchor: result is one row,
n = 1. - Run the recursive step using the previous result:
SELECT n + 1 FROM nums WHERE n < 10producesn = 2. - Run the recursive step again, now using
n = 2: producesn = 3. - Continue until the WHERE clause stops adding rows. At
n = 10, the recursive step's WHERE rejects, no new row, recursion ends. - Final result: rows 1 through 10.
Always include termination. If the recursive step has no condition that eventually fails, the CTE runs forever. Most engines have a recursion-depth safety net (often 1000 by default), but you'll get an error and a hung query, not a useful result.
Walking an org chart
The classic recursive-CTE problem: an employees table where each row has a manager_id pointing at another employee. Walk the chain to find every employee under a given manager.
WITH RECURSIVE reports AS (
-- Anchor: Alice herself
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE name = 'Alice'
UNION ALL
-- Recursive: anyone who reports to someone already in the result
SELECT e.id, e.name, e.manager_id, r.level + 1
FROM employees e
JOIN reports r ON e.manager_id = r.id
)
SELECT name, level FROM reports
ORDER BY level, name;
Each iteration finds the next layer down the tree. The level column tracks how many hops from Alice each row is — Alice is 0, her direct reports are 1, their reports are 2, and so on. Termination happens automatically when no new rows exist (no one reports to the latest layer).
Same skeleton works for any tree-shaped data: comment threads where each comment has a parent_id, file-system paths where each file has a parent_dir, friend-of-friend graphs (with cycle detection), product-category hierarchies.
Generating numbers and dates
Recursive CTEs are the standard SQL way to generate sequences when the engine doesn't have a generate_series function (Postgres has it, SQLite and SQL Server don't). Useful for filling in missing dates, building calendar dimension tables, or producing numbered lists.
WITH RECURSIVE calendar(day) AS (
SELECT date('2026-01-01')
UNION ALL
SELECT date(day, '+1 day') FROM calendar
WHERE day < date('2026-12-31')
)
SELECT day FROM calendar;
Combine with a LEFT JOIN to fill in missing days in a sales report:
WITH RECURSIVE calendar(day) AS (
SELECT date('2026-01-01')
UNION ALL
SELECT date(day, '+1 day') FROM calendar
WHERE day < date('now')
)
SELECT c.day,
COALESCE(SUM(o.amount), 0) AS daily_revenue
FROM calendar c
LEFT JOIN orders o ON date(o.created_at) = c.day
GROUP BY c.day
ORDER BY c.day;
Without the calendar CTE, days with zero sales would simply be missing from the result. The recursive CTE provides a complete date axis to LEFT JOIN against.
Are CTEs materialized? Performance notes
Modern engines treat most CTEs as inlined views — the optimizer pushes filters down into the CTE definition and merges it with the main query. Performance is usually identical to the equivalent subquery.
Two important exceptions to know:
- Postgres ≤ 11 treated every CTE as an "optimization fence" — it was always materialized to a temporary result, with no pushdown. This made CTEs slower than subqueries in some cases. Postgres 12+ inlines by default. If you're on an older Postgres, benchmark.
- Explicit MATERIALIZED keyword. Postgres 12+ lets you force materialization with
WITH cte_name AS MATERIALIZED (...). Useful when you reference the CTE in many places and the optimizer's inlining produces a worse plan than computing once and reusing.
For day-to-day analytics work, write CTEs for readability and trust the optimizer. If performance is a problem, use EXPLAIN ANALYZE to see what the engine is doing — it usually inlines, sometimes materializes, and you'll see which.
Three real-world patterns
1. Top-N per group via CTE
The most-asked window-function pattern uses a CTE because you can't filter on a window function in the same WHERE.
WITH ranked AS (
SELECT name, dept, salary,
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rk
FROM employees
)
SELECT * FROM ranked WHERE rk <= 3;2. Cohort analysis
First-touch CTE establishes the cohort, then a second CTE computes retention.
WITH first_visit AS (
SELECT user_id, date_trunc('month', MIN(visit_date)) AS cohort_month
FROM visits GROUP BY user_id
),
monthly_activity AS (
SELECT v.user_id, f.cohort_month,
date_trunc('month', v.visit_date) AS active_month
FROM visits v JOIN first_visit f USING (user_id)
)
SELECT cohort_month, active_month,
COUNT(DISTINCT user_id) AS retained_users
FROM monthly_activity
GROUP BY cohort_month, active_month;3. Decomposing a complex query
Whenever a query has more than three logical steps — filter, aggregate, join, filter again — break each step into its own CTE. The query takes the same number of lines but becomes reviewable.
CTE anti-patterns
Defining a CTE that's used once and trivial
If your CTE is SELECT * FROM table WHERE simple_condition and you reference it once, the CTE is overhead. Inline it as a subquery or filter directly in the main query.
Excessive CTE chains
Six CTEs that could collapse into two or three are overkill. Each CTE should represent one meaningful logical step — not every SELECT * deserves its own name.
Mutating in CTEs (where supported)
Postgres lets you write WITH inserted AS (INSERT ... RETURNING ...). Useful but easy to misuse. Side-effecting CTEs are hard to reason about — keep them rare and document them.
Recursing without a depth limit
Always test recursion termination on a small dataset first. A bug that produces infinite recursion can lock a database connection until the engine's safety net trips, by which point you've burned compute.
Practice with real challenges
CTEs click fast — usually within 2-3 challenges, the pattern becomes muscle memory. Recursive CTEs take longer because the mental model is genuinely different. SQL Quest's CTE challenges include both flavors.
Read about CTEs is the easy part. Now write them.
Each challenge runs in your browser. The Coach explains why a CTE chain is wrong — missing recursive termination, wrong column projection, joins to the wrong CTE — instead of "incorrect."
Practice CTEs — Free →