Tools · Fairness Audit
Drop in a CSV. Get a fairness audit in under a minute.
Built so every ML engineer can run a basic disparate-impact check before shipping — not just researchers with aequitas notebooks open.
Live audit · Synthetic Loan Approval
1,000 rows · seeded
1,000 loan applications with a protected group attribute and a model that scores repayment probability. Disparate-impact patterns are baked in to demonstrate the metrics.
Disparate Impact Ratio
Fails the four-fifths rule
Group B is approved at 27.8% vs 68.7% for Group A. The disparate-impact ratio is 0.40, below the 0.8 EEOC four-fifths threshold. This would constitute prima facie evidence of disparate impact under U.S. employment law.
By the numbers
Demographic parity gap
40.9%
|Approval_A − Approval_B|
Equal opportunity gap
21.0%
|TPR_A − TPR_B|
Equalized odds gap
21.0%
max(TPR gap, FPR gap)
Base-rate difference
36.1%
|P(Y=1|A) − P(Y=1|B)|
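All four gaps above can be computed from three arrays: true labels, binary predictions, and group membership. A minimal NumPy sketch (the function and variable names are ours, not the tool's internals):

```python
import numpy as np

def gap_metrics(y_true, y_pred, group, groups=("A", "B")):
    """Headline gaps for a two-group protected attribute (binary labels/predictions)."""
    a = group == groups[0]
    b = group == groups[1]

    def rate(x, mask):
        # Mean of a binary array restricted to a boolean mask.
        return x[mask].mean()

    parity_gap = abs(rate(y_pred, a) - rate(y_pred, b))   # |Approval_A - Approval_B|
    tpr_a = rate(y_pred, a & (y_true == 1))               # P(Yhat=1 | Y=1, A)
    tpr_b = rate(y_pred, b & (y_true == 1))
    fpr_a = rate(y_pred, a & (y_true == 0))               # P(Yhat=1 | Y=0, A)
    fpr_b = rate(y_pred, b & (y_true == 0))
    eo_gap = abs(tpr_a - tpr_b)                           # equal-opportunity gap
    eodds_gap = max(eo_gap, abs(fpr_a - fpr_b))           # equalized-odds gap
    base_gap = abs(rate(y_true, a) - rate(y_true, b))     # base-rate difference
    return parity_gap, eo_gap, eodds_gap, base_gap
```

On the synthetic loan data these evaluate to the percentages shown above; on your own data, run them against held-out predictions before deployment.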
How this works
Three metrics, each capturing a different kind of fairness. In general they cannot all be satisfied at once: when base rates differ across groups, no non-trivial classifier meets all three simultaneously (Chouldechova, 2017; Kleinberg, Mullainathan & Raghavan, 2016). So part of the engineering work is picking which one matters for your context.
Disparate Impact Ratio
min_g P(Ŷ=1|A=g) / max_g P(Ŷ=1|A=g)
Within each protected group, what fraction receive a positive prediction? The ratio of the smallest group rate to the largest should be ≥ 0.8 under the EEOC's four-fifths rule.
EEOC Uniform Guidelines, 1978 §1607.4(D)
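The ratio reduces to a few lines of NumPy. A sketch (the function name is ours; the tool's internals may differ):

```python
import numpy as np

def disparate_impact_ratio(y_pred, group):
    """Lowest group positive-prediction rate divided by the highest.
    Values below 0.8 fail the four-fifths rule."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return min(rates) / max(rates)
```

With the demo's approval rates (27.8% vs 68.7%) this comes out to roughly 0.40, the failing value reported above.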
Equal Opportunity
P(Ŷ=1 | Y=1, A=a) ≈ P(Ŷ=1 | Y=1, A=b)
Among people who actually deserved a positive outcome, do all groups get one at the same rate? Requires an equal true-positive rate (TPR) across groups.
Hardt, Price, Srebro (NeurIPS 2016)
Equalized Odds
TPR equal AND FPR equal across groups
Stricter than Equal Opportunity: it also requires equal false-positive rates. The "mistakes are distributed equally too" constraint.
Hardt, Price, Srebro (NeurIPS 2016)
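The distinction is easiest to see on toy data where true-positive rates match but false-positive rates do not: equal opportunity passes while equalized odds fails. An illustrative sketch (data invented for the example):

```python
import numpy as np

# Both groups have TPR = 0.5, but group A has FPR = 1.0 and group B has
# FPR = 0.0: equal opportunity holds while equalized odds is violated.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

def tpr(g):
    mask = (group == g) & (y_true == 1)
    return y_pred[mask].mean()

def fpr(g):
    mask = (group == g) & (y_true == 0)
    return y_pred[mask].mean()

eo_gap = abs(tpr("A") - tpr("B"))                  # 0.0: equal opportunity satisfied
eodds_gap = max(eo_gap, abs(fpr("A") - fpr("B")))  # 1.0: equalized odds violated
```

In other words, a model can hand out positive outcomes fairly among the deserving while concentrating its false alarms on one group; only equalized odds catches that.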
Why this matters
Most fairness tooling lives in research notebooks. To use it, you install five packages, learn three APIs, and write fifty lines of boilerplate before you can answer the question your VP asked you this morning: “does this model treat groups differently?”
That friction means audits don't happen. Or they happen too late, after the model is already in production. Or they get outsourced to a single “fairness specialist” who doesn't scale.
This tool is engineering hygiene — a 60-second baseline check that any team can run before deployment. It will not replace a real fairness audit. It will catch the obvious failures before they ship.
What's next
- CSV upload — bring your own predictions. Client-side parsing, nothing leaves your browser.
- Bundled COMPAS subset (ProPublica's recidivism dataset) and a synthetic hiring example.
- Exportable model card following Mitchell et al. (2019) — PDF or markdown.
- ROC-by-group plot and threshold-fairness frontier visualization.