B.A.I.L.I.F.F.: Bias Analysis in Interactive Legal Intelligence & Fairness Framework
Luke Blommesteyn
CUCAI 2026 Proceedings
Abstract
Most legal AI audits ask whether verdicts look fair. We ask a harder question: can a system produce fair-looking outcomes while running an unequal trial? We present B.A.I.L.I.F.F., a multi-agent framework for auditing procedural and outcome fairness in AI-driven legal proceedings. Three independent agents (Judge, Prosecution, Defense) conduct adversarial paired trials that differ only in the defendant's name. Bias is estimated via paired tests, hierarchical mixed models, a wild cluster bootstrap, and randomization inference. Our central finding is a fairness veneer: conviction rates can appear acceptable while the proceeding itself remains unequal. Across six model families, name swaps yield flip rates of 14.9%--38.3%, versus a 1% placebo baseline; this stochastic instability is a fairness risk in its own right. Defense agents for non-white defendants face more interruptions and fewer sustained objections, even as aggregate outcomes favor them. Static outcome audits are therefore insufficient; adversarial process-and-outcome auditing should be a required pre-deployment standard. Code: https://github.com/Western-Artificial-Intelligence/ai-law-agents
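
To make the paired name-swap design concrete, the following minimal Python sketch runs randomization inference on paired binary verdicts. The data, the 25% flip probability, and the function name paired_randomization_test are illustrative assumptions for this sketch, not the paper's actual pipeline; the idea is only that, under the null of no name effect, the two labels within each pair are exchangeable.

import numpy as np

rng = np.random.default_rng(0)

def paired_randomization_test(a, b, n_perm=10_000, rng=rng):
    """Randomization inference for a paired name-swap audit.

    a, b: binary verdict arrays (1 = conviction) for the same cases,
    differing only in the defendant's name. Under the null of no name
    effect, the labels within each pair are exchangeable, so we randomly
    swap them and recompute the conviction-rate gap.
    """
    a, b = np.asarray(a), np.asarray(b)
    observed = np.mean(a) - np.mean(b)
    diffs = a - b
    exceed = 0
    for _ in range(n_perm):
        signs = rng.choice([-1, 1], size=diffs.size)  # random within-pair swaps
        if abs(np.mean(signs * diffs)) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)  # add-one smoothed p-value

# Hypothetical paired verdicts from 200 name-swap trials.
a = rng.integers(0, 2, 200)                      # baseline-name verdicts
b = (a ^ (rng.random(200) < 0.25)).astype(int)   # ~25% of verdicts flip under the swap
gap, p = paired_randomization_test(a, b)
print(f"flip rate {np.mean(a != b):.1%}, conviction gap {gap:+.3f}, p = {p:.4f}")

Note how the flip rate and the conviction-rate gap are distinct quantities: verdicts can flip symmetrically in both directions, leaving the aggregate gap near zero, which is exactly the fairness-veneer pattern the abstract describes.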