SQL Window Functions Tutorial — RANK, LAG, Running Totals (2026)
Window functions appear in roughly 80% of mid-to-senior data-interview screens, and most candidates under-prepare. They're harder than JOINs and aggregates because they introduce a third way of thinking about rows — neither one row at a time, nor collapsed into groups, but each row with a calculation drawn from a window of its neighbors. Get the mental model right and the syntax falls into place.
Contents
- Why window functions are special
- Anatomy of a window function
- Ranking — ROW_NUMBER, RANK, DENSE_RANK
- PARTITION BY — windows per group
- LAG and LEAD — neighbor lookup
- Window frames — ROWS BETWEEN
- Running totals and rolling averages
- Top-N per group (the most-asked pattern)
- Why you can't filter on a window function in WHERE
- Performance notes
- Five interview patterns to memorize
- Practice with real challenges
Why window functions are special
Aggregates collapse rows. SUM(amount) FROM orders GROUP BY customer_id takes a thousand orders and produces one row per customer — the input rows are gone. Window functions don't collapse. They run an aggregate-like calculation over a "window" of rows, but every input row stays in the result, with the calculation attached as a new column.
That single property unlocks most of the patterns interviewers care about: ranking within a group, comparing each row to the previous one, computing a running total without losing per-day detail, finding the top N rows in each category. None of those are clean to express with regular GROUP BY because they all require keeping the underlying rows visible.
Anatomy of a window function
Every window function has the same shape:
function_name(arguments) OVER (
[PARTITION BY col, ...]
[ORDER BY col [ASC|DESC], ...]
[ROWS BETWEEN ... AND ...]
)
Three things to define:
- The function — could be a ranking function (
ROW_NUMBER,RANK), a positional function (LAG,LEAD,FIRST_VALUE,LAST_VALUE), or any aggregate (SUM,AVG,COUNT) made windowed by addingOVER. - The partition — optional groupings, like running calculations per department, per customer, per category. Skip
PARTITION BYand the window covers all rows. - The order and frame — for functions that care about row position (ranking, LAG/LEAD, running totals),
ORDER BYdefines the sequence andROWS BETWEENdefines how many neighboring rows feed the calculation.
Ranking — ROW_NUMBER, RANK, DENSE_RANK
Three ranking functions exist because they handle ties differently. Use the table below to choose:
SELECT name, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
RANK() OVER (ORDER BY salary DESC) AS rnk,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense
FROM employees;
Imagine three employees with $180k, $170k, $170k, $160k salaries. The output:
| name | salary | row_num | rnk | dense |
|---|---|---|---|---|
| Alice | 180000 | 1 | 1 | 1 |
| Bob | 170000 | 2 | 2 | 2 |
| Carol | 170000 | 3 | 2 | 2 |
| Dan | 160000 | 4 | 4 | 3 |
Read the differences:
- ROW_NUMBER: every row gets a unique number. Bob and Carol both earn $170k but get 2 and 3. The tie-breaking is arbitrary — usually physical row order — so two runs of the same query may swap them. Use ROW_NUMBER when you need a deterministic top-N and don't care about tie semantics.
- RANK: ties share a rank, then the next non-tie skips ahead. Bob and Carol both get 2; Dan gets 4 (because two rows occupied rank 2). Use RANK when ties should "use up" rank slots — like Olympic medals.
- DENSE_RANK: ties share a rank, no gaps. Bob and Carol both get 2; Dan gets 3. Use DENSE_RANK when you want rank slots to be sequential regardless of ties — like quartile bucketing.
PARTITION BY — windows per group
Without PARTITION BY, the window covers the entire result set — one global ranking, one global running total. Adding PARTITION BY resets the window per group.
SELECT name, department, salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS rk_in_dept
FROM employees;
Each department now has its own ranking starting from 1. The highest-paid engineer is rank 1 in Engineering even if they earn less than the highest-paid Sales rep — because Sales has its own rank-1 slot.
Mental model: PARTITION BY is like GROUP BY but doesn't collapse rows. Same grouping behavior, different output cardinality.
LAG and LEAD — neighbor lookup
LAG and LEAD let a row reference its neighbors. LAG(col, n) returns the value of col from n rows earlier in the window. LEAD(col, n) looks n rows forward. Default offset is 1.
SELECT ticker, day, price,
LAG(price, 1) OVER (
PARTITION BY ticker ORDER BY day
) AS prev_price,
price - LAG(price, 1) OVER (
PARTITION BY ticker ORDER BY day
) AS daily_delta
FROM stock_prices;
For the first row in each ticker's window, there is no previous row — LAG returns NULL. You can supply a default: LAG(price, 1, 0) returns 0 instead of NULL for boundary rows. Use the default when downstream math chokes on NULL; otherwise NULL is a useful signal that this is a boundary.
Foundational uses for LAG/LEAD:
- Day-over-day deltas, week-over-week growth — anywhere a row needs context from its previous row.
- Sessionization — group user events into sessions by detecting gaps.
LAG(event_time)tells you when the previous event was; if the gap exceeds 30 minutes, this row starts a new session. - Churn detection — find the last activity date per user and compare to today.
- Detecting consecutive events — was this purchase made on the day right after the previous one?
LAG(purchase_date) = purchase_date - 1.
Window frames — ROWS BETWEEN
The frame defines which rows feed an aggregate window function. The default frame is engine-dependent and not always what you want — be explicit.
-- Up to and including current row (running total)
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
-- Last 7 rows including current (rolling 7-row window)
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
-- Centered: 3 before + current + 3 after
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
-- Future-only: today through next 7 days
ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING
ROWS vs RANGE. ROWS BETWEEN counts physical rows. RANGE BETWEEN uses the ORDER BY column's values, which behaves unexpectedly with ties — multiple rows that share a date count as a single position. Default in PostgreSQL is RANGE, in SQLite it's RANGE for unbounded frames. Use ROWS unless you have a specific reason for RANGE.
Running totals and rolling averages
Two of the most-asked window patterns. Both use aggregate functions over an explicit frame.
Running total
SELECT day,
daily_revenue,
SUM(daily_revenue) OVER (
ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_revenue
FROM daily_sales
ORDER BY day;
Each row's cumulative_revenue equals the sum of every prior day's revenue plus today's. To compute it per category instead of globally, add PARTITION BY category.
Rolling average
SELECT ticker, day, price,
AVG(price) OVER (
PARTITION BY ticker
ORDER BY day
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS ma_7d
FROM stock_prices;
For the first 6 rows of each ticker, the window includes fewer than 7 rows — the first row's MA is just itself, the second is the average of rows 1 and 2, and so on. Some queries need a "warmup" filter: WHERE day >= start_date + 6 days to drop the partial windows. Decide based on whether partial values are misleading for your use case.
Top-N per group (the most-asked pattern)
If you take only one window-function pattern from this guide, take this one. It appears in nearly every interview that asks any window-function question, and the variations are mostly cosmetic.
WITH ranked AS (
SELECT name, department, salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS rk
FROM employees
)
SELECT name, department, salary
FROM ranked
WHERE rk <= 3
ORDER BY department, salary DESC;
Three components to memorize:
- Compute the rank inside a CTE. The rank is partitioned by the group column and ordered by the value you're ranking.
- Filter on the rank in the outer query. You cannot filter on the window function in the same WHERE — that's the next section.
- Pick the right ranker. ROW_NUMBER for an exact top-N (no ties at position N). RANK or DENSE_RANK if ties at the boundary should all be included.
Variants the interviewer may ask: top-N most-active customers per region, top-N longest-tenured employees per department, top-N highest-rated movies per genre, top-N best-selling products per category. Same skeleton every time.
Why you can't filter on a window function in WHERE
Try writing this query directly:
SELECT name, salary
FROM employees
WHERE ROW_NUMBER() OVER (ORDER BY salary DESC) <= 3;
-- Error: window functions are not allowed in WHERE
The error is the engine being explicit about a logical-execution-order rule. SQL processes a query in this order:
FROM+ JOINs — assemble the source rowsetWHERE— filter rowsGROUP BY— group what's leftHAVING— filter groups- Window functions execute
SELECT— project columnsORDER BYLIMIT
Window functions run after WHERE, so referencing one in WHERE means it doesn't exist yet. The fix: wrap the windowed query in a CTE or subquery, then filter in the outer query where the window function alias is available.
The same constraint applies to GROUP BY and HAVING. You can't group by or filter on a window function alias either, because they run later. Always wrap and filter in the outer query.
Performance notes
Window functions are usually faster than equivalent self-joins or correlated subqueries — engines have optimized window evaluation extensively. A few things to know:
- PARTITION BY uses sorting. If you partition by a column without an index that matches, the engine has to sort the entire result set. For large tables, an index on (partition_col, order_col) can drop a window query's runtime by 10-100×.
- Multiple window functions over the same OVER clause share work. Three rankings with identical PARTITION BY and ORDER BY only sort once. Use a named window if the OVER clause is repeated:
SELECT name, ROW_NUMBER() OVER w AS rn, RANK() OVER w AS rk FROM employees WINDOW w AS (PARTITION BY dept ORDER BY salary DESC); - Avoid windowing the entire table when you only need the top-N. If you want the top 3 employees globally, an
ORDER BY salary DESC LIMIT 3is much cheaper than ranking every row.
Five interview patterns to memorize
1. Top-N per group
Already covered. The skeleton: ROW_NUMBER + PARTITION BY + filter rk <= N in outer query.
2. Day-over-day delta
SELECT day, value,
value - LAG(value) OVER (ORDER BY day) AS delta
FROM daily_metrics;3. Running total
SUM(amount) OVER (
PARTITION BY customer
ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)4. Sessionization (gap detection)
SELECT user_id, event_time,
CASE WHEN event_time - LAG(event_time) OVER (
PARTITION BY user_id ORDER BY event_time
) > interval '30 minutes' THEN 1 ELSE 0 END AS new_session
FROM events;Then a SUM-over-window on new_session assigns a session ID to each event. Classic two-step pattern.
5. Cohort retention
WITH first_visit AS (
SELECT user_id, MIN(visit_date) AS cohort_date
FROM visits GROUP BY user_id
)
SELECT v.user_id, f.cohort_date,
v.visit_date - f.cohort_date AS days_since_first
FROM visits v
JOIN first_visit f USING (user_id);Not strictly a window function, but the pattern often combines with windows for "what % of cohort returned in week N" calculations.
Practice with real challenges
Window functions are pure pattern recognition. Reading about them gets you 10% of the way; writing 30+ challenges from cold memory gets you the rest. SQL Quest has 33 window-function-specific challenges, sliced by sub-pattern.
The interview screen wants window functions cold
Each challenge runs in your browser against a real SQLite engine. The AI Coach explains why your query failed — wrong partition, wrong frame, missing tie-handling — instead of just "incorrect."
Practice Window Functions — Free →