Tools · Fairness Audit
Drop in a CSV. Get a fairness audit in under a minute.
Built so every ML engineer can run a basic disparate-impact check before shipping — not just researchers with aequitas notebooks open.
Live audit · Synthetic Loan Approval
1,000 rows · seeded
1,000 loan applications with a protected group attribute and a model that scores repayment probability. Disparate-impact patterns are baked in to demonstrate the metrics.
Disparate Impact Ratio
Fails the four-fifths rule
Group B is approved at 27.8% vs 68.7% for Group A. The disparate-impact ratio is 0.40, below the 0.8 EEOC four-fifths threshold. This would constitute prima facie evidence of disparate impact under U.S. employment law.
By the numbers
Demographic parity gap
40.9%
|Approval_A − Approval_B|
Equal opportunity gap
21.0%
|TPR_A − TPR_B|
Equalized odds gap
21.0%
max(TPR gap, FPR gap)
Base-rate difference
36.1%
|P(Y=1|A) − P(Y=1|B)|
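All four gaps above can be computed from three arrays: true labels, binary predictions, and group membership. A minimal NumPy sketch (the function and variable names are ours, not the tool's internals):

```python
import numpy as np

def gap_metrics(y_true, y_pred, group, groups=("A", "B")):
    """Headline gaps for a two-group protected attribute (binary labels/predictions)."""
    a = group == groups[0]
    b = group == groups[1]

    def rate(x, mask):
        # Mean of a binary array restricted to a boolean mask.
        return x[mask].mean()

    parity_gap = abs(rate(y_pred, a) - rate(y_pred, b))   # |Approval_A - Approval_B|
    tpr_a = rate(y_pred, a & (y_true == 1))               # P(Yhat=1 | Y=1, A)
    tpr_b = rate(y_pred, b & (y_true == 1))
    fpr_a = rate(y_pred, a & (y_true == 0))               # P(Yhat=1 | Y=0, A)
    fpr_b = rate(y_pred, b & (y_true == 0))
    eo_gap = abs(tpr_a - tpr_b)                           # equal-opportunity gap
    eodds_gap = max(eo_gap, abs(fpr_a - fpr_b))           # equalized-odds gap
    base_gap = abs(rate(y_true, a) - rate(y_true, b))     # base-rate difference
    return parity_gap, eo_gap, eodds_gap, base_gap
```

On the synthetic loan data these evaluate to the percentages shown above; on your own data, run them against held-out predictions before deployment.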
How this works
Three metrics, each capturing a different kind of fairness. In general they cannot all be satisfied at once: when base rates differ across groups, no non-trivial classifier meets all three simultaneously (Chouldechova, 2017; Kleinberg, Mullainathan & Raghavan, 2016). So part of the engineering work is picking which one matters for your context.
Disparate Impact Ratio
min_g P(Ŷ=1|A=g) / max_g P(Ŷ=1|A=g)
Within each protected group, what fraction receive a positive prediction? The ratio of the smallest group rate to the largest should be ≥ 0.8 under the EEOC's four-fifths rule.
EEOC Uniform Guidelines, 1978 §1607.4(D)
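The ratio reduces to a few lines of NumPy. A sketch (the function name is ours; the tool's internals may differ):

```python
import numpy as np

def disparate_impact_ratio(y_pred, group):
    """Lowest group positive-prediction rate divided by the highest.
    Values below 0.8 fail the four-fifths rule."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return min(rates) / max(rates)
```

With the demo's approval rates (27.8% vs 68.7%) this comes out to roughly 0.40, the failing value reported above.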
Equal Opportunity
P(Ŷ=1 | Y=1, A=a) ≈ P(Ŷ=1 | Y=1, A=b)
Among people who actually deserved a positive outcome, do all groups get one at the same rate? Requires an equal true-positive rate (TPR) across groups.
Hardt, Price, Srebro (NeurIPS 2016)
Equalized Odds
TPR equal AND FPR equal across groups
Stricter than Equal Opportunity: it also requires equal false-positive rates. The "mistakes are distributed equally too" constraint.
Hardt, Price, Srebro (NeurIPS 2016)
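The distinction is easiest to see on toy data where true-positive rates match but false-positive rates do not: equal opportunity passes while equalized odds fails. An illustrative sketch (data invented for the example):

```python
import numpy as np

# Both groups have TPR = 0.5, but group A has FPR = 1.0 and group B has
# FPR = 0.0: equal opportunity holds while equalized odds is violated.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

def tpr(g):
    mask = (group == g) & (y_true == 1)
    return y_pred[mask].mean()

def fpr(g):
    mask = (group == g) & (y_true == 0)
    return y_pred[mask].mean()

eo_gap = abs(tpr("A") - tpr("B"))                  # 0.0: equal opportunity satisfied
eodds_gap = max(eo_gap, abs(fpr("A") - fpr("B")))  # 1.0: equalized odds violated
```

In other words, a model can hand out positive outcomes fairly among the deserving while concentrating its false alarms on one group; only equalized odds catches that.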
Why this matters
Most fairness tooling lives in research notebooks. To use it, you install five packages, learn three APIs, and write fifty lines of boilerplate before you can answer the question your VP asked you this morning: “does this model treat groups differently?”
That friction means audits don't happen. Or they happen too late, after the model is already in production. Or they get outsourced to a single “fairness specialist” who doesn't scale.
This tool is engineering hygiene — a 60-second baseline check that any team can run before deployment. It will not replace a real fairness audit. It will catch the obvious failures before they ship.
What's next
- CSV upload — bring your own predictions. Client-side parsing, nothing leaves your browser.
- Bundled COMPAS subset (ProPublica's recidivism dataset) and a synthetic hiring example.
- Exportable model card following Mitchell et al. (2019) — PDF or markdown.
- ROC-by-group plot and threshold-fairness frontier visualization.