Science of Data Ethics (CIS 399)
Spring 2021
All lectures virtual and recorded live
Tuesdays and Thursdays 10:30AM-12PM ET


Prof. Michael Kearns
Office hours (virtual): Tuesdays at noon (right after lecture) or by appointment.

Teaching Assistants:

Sheng Gao
Office Hours: Wednesdays 11AM-12PM ET or by appointment.

Subin Lee
Office Hours: Tuesdays 3-4PM ET or by appointment.

Hua Wang
Office Hours: Fridays 3:30-4:30PM ET or by appointment.

Course Description

This course is about the social and human problems that can arise from algorithms, AI and machine learning, and how we might design these technologies to be "better behaved" in the first place. It is first and foremost a science or engineering course, since we will be developing algorithm design principles. You can get a broad sense of course themes and topics by visiting the websites for the 2020 and 2019 offerings.

Prerequisites: Familiarity with some machine learning, basic statistics and probability theory will be helpful. While this is not a theory class, you need to be comfortable with mathematical notation and formalism. There will be some simple coding and data analysis assignments, so some basic programming ability is needed.

Course content will include readings from the scientific literature, the mainstream media and other articles and books.

Grades will be based on homeworks and quizzes, as well as class participation and a book club presentation.

Important: All lectures for the course will be recorded live each Tuesday and Thursday from 10:30AM to noon ET. While I realize it's not possible for some of you, I'd like to encourage anyone who is able to attend the live lectures, participate in discussions, and ask questions. This engagement makes the lectures better for everyone --- me, the attendees, and those watching the recordings.

Here is the Zoom link for the live sessions. After each lecture, the recordings and lecture notes will be posted to the schedule below.

Date                             Topics Slides, Readings, Assignments, Announcements
Thu Jan 21
Tue Jan 26
Course Introduction and Overview
Lecture videos: [Jan 21]   [Jan 26]

Lecture notes

A general-audience introduction to some of the themes of the course is given in the (recommended but not required) book The Ethical Algorithm: The Science of Socially Aware Algorithm Design, by M. Kearns and A. Roth.

Also recommended but not required: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, by C. O'Neil.

Thu Jan 28
Tue Feb 2
Thu Feb 4
Foundations of Machine Learning

Lecture videos: [Jan 28]   [Feb 2]   [Feb 4]

Lecture notes

Here is Homework 1, tentatively due Feb 16.

There is a good and comprehensive set of videos and readings related to many of the topics we covered in these lectures in this Google machine learning course.

Tue Feb 9
Thu Feb 11
Tue Feb 16
Thu Feb 18
(Un)Fairness in Machine Learning: COMPAS and ProPublica

Lecture videos: [Feb 9]   [Feb 11]   [Feb 16]   [Feb 18]

Lecture notes

The following readings are required; you should read the two ProPublica pieces before the Feb 9 lecture so we can discuss them then.

ProPublica article on COMPAS

ProPublica analysis

COMPAS Risk Assessment Survey (just skim)

Practitioner's Guide to COMPAS Core (no need to read, but we'll peruse a bit together in lecture)

ProPublica github repository, including dataset (we'll look at the dataset a bit in lecture)

Northpointe response to ProPublica (just skim)

Tue Feb 23
Thu Feb 25
Tue Mar 2
Thu Mar 4
Fairness in ML: Models and Algorithms

Lecture videos: [Feb 23]   [Feb 25]   [Mar 2]   [Mar 4]

Lecture notes


Inherent Trade-Offs in the Fair Determination of Risk Scores, J. Kleinberg, S. Mullainathan, M. Raghavan. (read first 8 pages)

The Frontiers of Fairness in Machine Learning. Alexandra Chouldechova, Aaron Roth. (read entire article)

Please play around with the following Google demo site on fairness and ML.

Here is the zip file for Homework 2, which is due Monday March 8.