A real synthetic dataset — 2,165 card transactions with fraud rings injected. Learn the five SQL patterns fraud teams actually ship: 3-sigma anomaly bounds, velocity rules, impossible-travel detection, collusion rings via shared device fingerprint, and recursive CTEs that walk a chargeback chain.
Most SQL practice platforms cover the same surface: SELECT, JOIN, GROUP BY, a window function or two. Almost none teach the SQL fraud analysts actually run on the job — statistical anomaly bounds, time-window velocity checks, geographic mismatch detection, recursive chargeback walks. That gap matters because fraud analyst is one of the highest-paid analytics roles in the market and SQL is consistently the #1 hard skill listed on the job description.
This page gives you 12 free challenges on a real fraud dataset — synthetic but realistic, with fraud rings, collusion patterns, and chargeback chains injected on top of legitimate baseline activity. You write queries that flag the bad ones. The diagnostic tells you which fraud you missed and why your threshold was off.
Every pattern below maps to at least one challenge. Code snippets show the shape of the answer — try the challenge yourself, then compare.
Compute the mean and standard deviation of transaction amounts, then flag any transaction more than 3 standard deviations above the mean. The CTE-plus-CROSS-JOIN pattern is the SQL equivalent of scipy.stats.zscore.
WITH stats AS ( SELECT AVG(amount) AS mu, SQRT(AVG((amount - (SELECT AVG(amount) FROM transactions))* (amount - (SELECT AVG(amount) FROM transactions)))) AS sigma FROM transactions ) SELECT t.* FROM transactions t CROSS JOIN stats s WHERE t.amount > s.mu + 3 * s.sigma;
Self-join transactions on account_id where the timestamp difference is under 5 minutes. julianday() * 86400 converts SQLite date math into seconds. Group, count, threshold.
SELECT a.account_id, COUNT(b.txn_id) AS burst_n FROM transactions a JOIN transactions b ON a.account_id = b.account_id AND b.txn_at BETWEEN a.txn_at AND DATETIME(a.txn_at, '+5 minutes') GROUP BY a.account_id, a.txn_at HAVING COUNT(b.txn_id) >= 5;
Use the LAG window function to get the previous transaction's location for the same account, then compute distance via a degree-distance proxy (~111km per degree). Flag pairs that travelled faster than physically possible.
WITH seq AS ( SELECT account_id, txn_at, lat, lng, LAG(lat) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_lat, LAG(lng) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_lng, LAG(txn_at) OVER (PARTITION BY account_id ORDER BY txn_at) AS prev_at FROM transactions ) SELECT * FROM seq WHERE SQRT((lat-prev_lat)*(lat-prev_lat) + (lng-prev_lng)*(lng-prev_lng)) * 111 > 500 AND (julianday(txn_at) - julianday(prev_at)) * 1440 < 25;
Self-join accounts where they share a device_fingerprint, with the a.account_id < b.account_id trick to dedupe pairs. This catches coordinated rings that try to evade per-account limits.
SELECT a.account_id AS account_a, b.account_id AS account_b, a.device_fingerprint FROM accounts a JOIN accounts b ON a.device_fingerprint = b.device_fingerprint AND a.account_id < b.account_id ORDER BY a.device_fingerprint;
Chargebacks reference earlier chargebacks via related_chargeback_id. A recursive CTE walks the chain with a depth guard so you don't infinite-loop on cycles. This is the SQL pattern fraud investigators use to map a coordinated dispute network.
WITH RECURSIVE chain AS ( -- anchor: every chargeback as a possible root SELECT chargeback_id, related_chargeback_id, 0 AS depth FROM chargebacks WHERE related_chargeback_id IS NULL UNION ALL SELECT c.chargeback_id, c.related_chargeback_id, ch.depth + 1 FROM chargebacks c JOIN chain ch ON c.related_chargeback_id = ch.chargeback_id WHERE ch.depth < 5 -- safety guard ) SELECT * FROM chain WHERE depth > 0 ORDER BY depth DESC;
7 additional challenges adapt the same statistical techniques to bank-level data — NPL anomalies (2-sigma), tier-1 capital floor violations, ROA outliers via OFFSET percentile, geographic concentration risk. Same SQL muscle, different domain.
See the FDIC bank track →Four tables modelling a card-payments system across five countries. Fraud is INJECTED on top of legitimate baseline activity so you can verify your queries by comparing detected counts to the known injection counts.
📊 Fraud injection truth table
Generated deterministically with seed 20260503. Re-running the generator produces byte-identical data, so your validation queries always pass against the same expected counts.
scripts/generate-fraud-transactions.js — public domain (CC0). Use for any purpose, including training and commercial work.
Fraud-transactions challenges (#270-274) plus 7 fraud-adjacent FDIC bank risk challenges (#260-266) — same statistical muscle, different domain.
Browser-based · No signup · No credit card
US comp band
(early to mid-career)
BLS-projected demand growth (financial analyst occupations)
SQL on most fraud analyst job descriptions
Common for senior fraud and AML roles
Banks and fintechs lose $48 billion a year to fraud (Javelin 2023). Every dollar of fraud loss avoided pays for an analyst many times over, which is why fraud and AML teams rarely face the same hiring freezes data analytics teams see during downturns.
The work is also unusually intellectually engaging. Fraud is adversarial — patterns shift weekly as fraudsters learn what your detection misses. SQL is the daily tool because that's where the data lives, where the speed is, and where audit trails are easiest. Most fraud analyst job descriptions list SQL as the first required skill, ahead of Excel, ahead of Python, ahead of any specific fraud tool.
Companies hiring fraud analysts in 2026 include card networks (Visa, Mastercard), payment processors (Stripe, Adyen, Block), neobanks (Chime, Revolut, Wise), marketplaces (Amazon, eBay, Shopify), and risk-side teams at every major bank. Job titles vary — Fraud Analyst, Risk Analyst, AML Analyst, Trust & Safety Analyst, Payments Risk Analyst — but the SQL pattern requirements are nearly identical.
12 free challenges. Real synthetic fraud dataset. Your first query in 60 seconds.
Start Practicing — Free ⚡Browser-based. No signup. SQLite-WASM runs on your machine.