Live fraud-detection patterns · 12 free challenges

Where SQL stops a bad transaction.
Practice the patterns fraud analysts run every day.

A real synthetic dataset — 2,165 card transactions with fraud rings injected. Learn the five SQL patterns fraud teams actually ship: 3-sigma anomaly bounds, velocity rules, impossible-travel detection, collusion rings via shared device fingerprint, and recursive CTEs that walk a chargeback chain.

Start Practicing — Free See the patterns ↓
2,165
Transactions
200
Accounts
10
Fraud rings
76
Chargebacks
12
Free challenges
Why this exists

The SQL no other practice site teaches

Most SQL practice platforms cover the same surface: SELECT, JOIN, GROUP BY, a window function or two. Almost none teach the SQL fraud analysts actually run on the job — statistical anomaly bounds, time-window velocity checks, geographic mismatch detection, recursive chargeback walks. That gap matters because fraud analyst is one of the highest-paid analytics roles in the market and SQL is consistently the #1 hard skill listed on the job description.

This page gives you 12 free challenges on a real fraud dataset — synthetic but realistic, with fraud rings, collusion patterns, and chargeback chains injected on top of legitimate baseline activity. You write queries that flag the bad ones. The diagnostic tells you which fraud you missed and why your threshold was off.

The five patterns

What you'll actually practice

Every pattern below maps to at least one challenge. Code snippets show the shape of the answer — try the challenge yourself, then compare.

Pattern 1 · Anomaly bounds

3-sigma transaction outliers

Compute the mean and standard deviation of transaction amounts, then flag any transaction more than 3 standard deviations above the mean. The CTE-plus-CROSS-JOIN pattern is the SQL equivalent of scipy.stats.zscore.

WITH stats AS (
  SELECT AVG(amount) AS mu,
         SQRT(AVG((amount - (SELECT AVG(amount) FROM transactions))*
              (amount - (SELECT AVG(amount) FROM transactions)))) AS sigma
  FROM transactions
)
SELECT t.* FROM transactions t CROSS JOIN stats s
WHERE t.amount > s.mu + 3 * s.sigma;
Pattern 2 · Velocity rules

5+ transactions in 5 minutes

Self-join transactions on account_id where the timestamp difference is under 5 minutes. julianday() * 86400 converts SQLite date math into seconds. Group, count, threshold.

SELECT a.account_id, COUNT(b.txn_id) AS burst_n
FROM transactions a
JOIN transactions b
  ON a.account_id = b.account_id
 AND b.txn_at BETWEEN a.txn_at
   AND DATETIME(a.txn_at, '+5 minutes')
GROUP BY a.account_id, a.txn_at
HAVING COUNT(b.txn_id) >= 5;
Pattern 3 · Impossible travel

Geographic mismatch via LAG

Use the LAG window function to get the previous transaction's location for the same account, then compute distance via a degree-distance proxy (~111km per degree). Flag pairs that travelled faster than physically possible.

WITH seq AS (
  SELECT account_id, txn_at, lat, lng,
    LAG(lat) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_lat,
    LAG(lng) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_lng,
    LAG(txn_at) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_at
  FROM transactions
)
SELECT * FROM seq
WHERE SQRT((lat-prev_lat)*(lat-prev_lat) +
            (lng-prev_lng)*(lng-prev_lng)) * 111 > 500
  AND (julianday(txn_at) - julianday(prev_at)) * 1440 < 25;
Pattern 4 · Collusion ring

Shared device fingerprint

Self-join accounts where they share a device_fingerprint, with the a.account_id < b.account_id trick to dedupe pairs. This catches coordinated rings that try to evade per-account limits.

SELECT a.account_id AS account_a,
       b.account_id AS account_b,
       a.device_fingerprint
FROM accounts a
JOIN accounts b
  ON a.device_fingerprint = b.device_fingerprint
 AND a.account_id < b.account_id
ORDER BY a.device_fingerprint;
Pattern 5 · Chargeback chain

Recursive CTE walk

Chargebacks reference earlier chargebacks via related_chargeback_id. A recursive CTE walks the chain with a depth guard so you don't infinite-loop on cycles. This is the SQL pattern fraud investigators use to map a coordinated dispute network.

WITH RECURSIVE chain AS (
  -- anchor: every chargeback as a possible root
  SELECT chargeback_id, related_chargeback_id, 0 AS depth
  FROM chargebacks
  WHERE related_chargeback_id IS NULL
  UNION ALL
  SELECT c.chargeback_id, c.related_chargeback_id, ch.depth + 1
  FROM chargebacks c
  JOIN chain ch ON c.related_chargeback_id = ch.chargeback_id
  WHERE ch.depth < 5   -- safety guard
)
SELECT * FROM chain WHERE depth > 0 ORDER BY depth DESC;
Bonus · Adjacent

FDIC bank risk patterns

7 additional challenges adapt the same statistical techniques to bank-level data — NPL anomalies (2-sigma), tier-1 capital floor violations, ROA outliers via OFFSET percentile, geographic concentration risk. Same SQL muscle, different domain.

See the FDIC bank track →
The dataset

Synthetic. Realistic. Inspectable.

Four tables modelling a card-payments system across five countries. Fraud is INJECTED on top of legitimate baseline activity so you can verify your queries by comparing detected counts to the known injection counts.

accounts
200 rows · 10 marked as fraud
account_id · email · signup_at · country · device_fingerprint · ip_block · status
merchants
25 rows · 5 risk tiers
merchant_id · name · category · country · risk_tier
transactions
2,165 rows · with lat/lng
txn_id · account_id · amount · txn_at · merchant_id · lat · lng · status
chargebacks
76 rows · 5x rate on fraud accounts
chargeback_id · txn_id · account_id · merchant_id · reason_code · opened_at · resolved_at · status · related_chargeback_id

📊 Fraud injection truth table

10 fraud accounts
5% of total — your detection rate should land here
15 amount outliers
$2,500-$9,500 vs ~$50 baseline mean
4 velocity bursts
6-10 transactions within 5 minutes
3 geo mismatches
NYC↔Tokyo, London↔Sydney, Istanbul↔Mexico City
5 collusion accounts
shared device dev_x4f2a9b1c7e
5 chained chargebacks
3-deep + branching for recursive CTE

Generated deterministically with seed 20260503. Re-running the generator produces byte-identical data, so your validation queries always pass against the same expected counts.

Data source + license: Fully synthetic. No real PII, no real merchant names, no real card numbers. Generated by scripts/generate-fraud-transactions.js — public domain (CC0). Use for any purpose, including training and commercial work.
12 free challenges

Pick a difficulty. Solve. Repeat.

Fraud-transactions challenges (#270-274) plus 7 fraud-adjacent FDIC bank risk challenges (#260-266) — same statistical muscle, different domain.

Hard#270
3-Sigma Anomaly: Transaction Amount Outliers
CTE + CROSS JOIN to compute mean/sigma, then filter transactions above the 3-sigma threshold. The classic anomaly-detection SQL pattern.
Hard#271
Velocity Rule — 5+ Transactions in 5 Minutes
Self-join with julianday-based time math to find accounts with rapid burst activity. The bread-and-butter velocity check.
Hard#272
Geographic Mismatch — Impossible Travel
LAG over (PARTITION BY account_id ORDER BY txn_at) plus a degree-distance proxy. Catches cards used in two cities within 25 minutes.
Medium#273
Cross-Account Collusion — Shared Device Fingerprint
Self-join accounts on device_fingerprint with a.id < b.id to dedupe pairs. Surfaces coordinated rings hiding behind separate accounts.
Hard#274
Recursive CTE — Chargeback Chain Investigation
Walk a chargeback network via related_chargeback_id with a depth guard. The recursive CTE pattern fraud teams ship in production.
Hard#260
Banks with NPL Above 2-Sigma
Same stats pattern as #270, applied to bank non-performing loan ratios — flag banks with risk profiles outside the 2-sigma envelope.
Hard#261
NPL Quarter-over-Quarter Acceleration
LAG window function for QoQ deltas. Catches risk profile deteriorating fast, the leading indicator of a bank failure.
Medium#262
Tier 1 Capital Below Well-Capitalized Threshold
Filter banks under the 8% regulatory floor. The simplest risk rule — a regulator's first query of the morning.
Medium#263
Failure Clustering — State + Year
CAST + SUBSTR to extract year from a date, then GROUP BY (state, year) to surface contagion clusters during 2008-style waves.
Medium#264
ROA Outliers — Top 5% by Return
Percentile via OFFSET trick (or PERCENT_RANK). Surfaces banks with suspiciously high profitability — sometimes legitimate, sometimes a yellow flag.
Medium#265
Geographic Concentration — Single-State Banks
Self-join + ratio. Banks with all branches in one state carry concentration risk — the technique generalizes to single-merchant or single-IP fraud rings.
Medium#266
Trust Banks vs Others — NPL Profile Comparison
CASE-based segmentation with conditional aggregation. The query shape applies cleanly to fraud-vs-legit account profile comparisons.
Start with #270 — 3-Sigma Anomaly →

Browser-based · No signup · No credit card

Why fraud analytics

One of the highest-paid analytics roles

$95k–$180k

US comp band
(early to mid-career)

~12% YoY

BLS-projected demand growth (financial analyst occupations)

#1 hard skill

SQL on most fraud analyst job descriptions

Hybrid & remote

Common for senior fraud and AML roles

Banks and fintechs lose $48 billion a year to fraud (Javelin 2023). Every dollar of fraud loss avoided pays for an analyst many times over, which is why fraud and AML teams rarely face the same hiring freezes data analytics teams see during downturns.

The work is also unusually intellectually engaging. Fraud is adversarial — patterns shift weekly as fraudsters learn what your detection misses. SQL is the daily tool because that's where the data lives, where the speed is, and where audit trails are easiest. Most fraud analyst job descriptions list SQL as the first required skill, ahead of Excel, ahead of Python, ahead of any specific fraud tool.

Companies hiring fraud analysts in 2026 include card networks (Visa, Mastercard), payment processors (Stripe, Adyen, Block), neobanks (Chime, Revolut, Wise), marketplaces (Amazon, eBay, Shopify), and risk-side teams at every major bank. Job titles vary — Fraud Analyst, Risk Analyst, AML Analyst, Trust & Safety Analyst, Payments Risk Analyst — but the SQL pattern requirements are nearly identical.

Go deeper

Related reading

📝 Tutorial
SQL for Fraud Analytics — 5 patterns explained
Long-form walkthrough of the same patterns with runnable examples and design notes.
📝 Tutorial
Window Functions — LAG, LEAD, ROW_NUMBER, RANK
LAG is the workhorse for impossible-travel detection. Learn the full window family.
📝 Tutorial
Common Table Expressions, Including Recursive CTEs
The chargeback chain pattern relies on a recursive CTE. This tutorial covers the syntax in depth.
📝 Tutorial
GROUP BY — Aggregation, HAVING, the 5 Mistakes
Fraud rules are usually per-account or per-window aggregations. The GROUP BY guide names the patterns and the bugs.
📝 Tutorial
CASE WHEN — Conditional Logic + Pivots
Fraud-rule logic and tier segmentation are CASE-based. Conditional aggregation is the core fraud-rule pattern.
🎯 Interview prep
SQL Interview Prep — full curriculum
If you're preparing for a fraud analyst interview, the broader SQL interview prep page covers the foundational pattern bank.
🏦 Sector track (Türkçe)
Finans & Bankacılık SQL — FDIC verisiyle
The Turkish-language sector landing for finance SQL practice. Same FDIC bank-level dataset.
🏦 Company-specific
JPMorgan SQL Interview Prep
JPMorgan's fraud-analyst SQL interviews lean heavily on the patterns above. Company-specific prep page.
FAQ

Common questions

Catch the fraud the
other queries miss.

12 free challenges. Real synthetic fraud dataset. Your first query in 60 seconds.

Start Practicing — Free ⚡

Browser-based. No signup. SQLite-WASM runs on your machine.