Science of Data Ethics
All lectures virtual and recorded live
Tuesdays and Thursdays 10:30AM-12PM ET
Prof. Michael Kearns
Office hours (virtual): Tuesdays at noon (right after lecture) or by appointment.
Office Hours: Wednesdays 11AM-12PM ET or by appointment.
Office Hours: Tuesdays 3-4PM ET or by appointment.
Office Hours: Fridays 3:30-4:30PM ET or by appointment.
This course is about the social and human problems that can arise from algorithms, AI and machine learning, and how we might design these technologies to be "better behaved" in the first place. It is first and foremost a science or engineering course, since we will be developing algorithm design principles. You can get a broad sense of course themes and topics by visiting the websites for the 2020 and 2019 offerings.
Prerequisites: Familiarity with some machine learning, basic statistics and probability theory will be helpful. While this is not a theory class, you need to be comfortable with mathematical notation and formalism. There will be some simple coding and data analysis assignments, so some basic programming ability is needed.
Course content will include readings from the scientific literature, the mainstream media, and other articles and books.
Grades will be based on homeworks and quizzes, as well as class participation and a book club presentation.
Important: All lectures for the course will be recorded live each Tuesday and Thursday from 10:30AM to noon ET. While I realize it's not possible for some of you, I'd like to encourage anyone who is able to attend the live lectures, participate in discussions, and ask questions. This engagement makes the lectures better for everyone: me, the attendees, and those watching the recordings.
Here is the Zoom link for the live sessions. After each lecture, the recordings and lecture notes will be posted to the schedule below.
Slides, Readings, Assignments, Announcements
Thu Jan 21
Tue Jan 26
Course Introduction and Overview
A general-audience introduction to some of the themes of the course is given in the (recommended but not required) book The Ethical Algorithm: The Science of Socially Aware Algorithm Design, by M. Kearns and A. Roth.
Also recommended but not required: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, by C. O'Neil.
Thu Jan 28
Tue Feb 2
Thu Feb 4
Foundations of Machine Learning
Here is Homework 1, tentatively due Feb 16.
This Google machine learning course offers a good, comprehensive set of videos and readings on many of the topics we covered in these lectures.
Tue Feb 9
Thu Feb 11
Tue Feb 16
Thu Feb 18
(Un)Fairness in Machine Learning: COMPAS and ProPublica
The following readings are required; you should read the two ProPublica pieces before the Feb 9 lecture so we can discuss them then.
COMPAS Risk Assessment Survey (just skim)
Practitioner's Guide to COMPAS Core (no need to read, but we'll peruse a bit together in lecture)
ProPublica github repository, including dataset (we'll look at the dataset a bit in lecture)
Northpointe response to ProPublica (just skim)
Tue Feb 23
Thu Feb 25
Tue Mar 2
Thu Mar 4
Tue Mar 9
Tue Mar 16
Thu Mar 18
Tue Mar 23
Fairness in ML: Models and Algorithms
[Mar 18 (Berk)]
Inherent Trade-Offs in the Fair Determination of Risk Scores. J. Kleinberg, S. Mullainathan, M. Raghavan. (read first 8 pages)
The Frontiers of Fairness in Machine Learning. A. Chouldechova, A. Roth. (read entire article)
Please play around with the following Google demo site on fairness and ML.
Here is the zip file for Homework 2, which is due Monday March 8.
On Thursday March 18, we will finish our fairness studies with a guest lecture by Penn's own Richard Berk, who will speak on "Politically Correct Criminal Justice Risk Assessment". Please read in advance the following related paper (long and technical, so just peruse for the high-level messages), and this New York Times op-ed.
Thu Mar 25
Thu Apr 1
Tue Apr 6
Thu Apr 8
Tue Apr 13
Thu Apr 15
Tue Apr 20
Thu Apr 22
Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. P. Ohm. (not assigned, but for your perusal)
Differentially Private Query Release Through Adaptive Projection. S. Aydore, W. Brown, M. Kearns, K. Kenthapadi, L. Melis, A. Roth, A. Siva.
How One of Apple's Key Privacy Safeguards Falls Short. Wired magazine.
Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12. J. Tang, A. Korolova, X. Bai, X. Wang, X. Wang.
See how your community is moving around differently due to COVID-19. Google Covid Mobility Reports.
Here is the zip file for Homework 3, which is due Tuesday April 6.
It's time to start thinking about and planning for "book club", which will be held in the final lectures of the semester. (There may also be a brief written homework assignment on differential privacy.) For book club, you should form teams of no more than five people, and choose a relatively recent book on technology, society and ethics. It's fine and common for the books to be written for a general audience. You will read the book, discuss it with your team, and prepare a roughly 15-minute oral/slide presentation of your book for the class sessions. An alternative is to identify a few related technical papers from the scientific literature on course themes, and present an overview/synthesis of the papers. More details to come, but action items for now are to form your teams, and to propose your books to me. I'd like to make sure there are no duplicates, and ideally that the books are ones I haven't heard about before.
Wed Apr 28
Thu Apr 29
Book Club Presentations
[Apr 28]   [Apr 29]