Rawlsian Fairness for Machine Learning

Matthew Joseph, Michael Kearns, Jamie Morgenstern, Seth Neel, Aaron Roth

Motivated by concerns that automated decision-making procedures can unintentionally lead to discriminatory behavior, we study a technical definition of fairness modeled after John Rawls' notion of "fair equality of opportunity". In the context of a simple model of online decision making, we give an algorithm that satisfies this fairness constraint, while still being able to learn at a rate that is comparable to (but necessarily worse than) that of the best algorithms absent a fairness constraint. We prove a regret bound for fair algorithms in the linear contextual bandit framework that is a significant improvement over our companion paper, which gives black-box reductions in a more general setting. We analyze our algorithms both theoretically and experimentally. Finally, we introduce the notion of a "discrimination index", and show that standard algorithms for our problem exhibit structured discriminatory behavior, whereas the "fair" algorithms we develop do not.