Science of Data Ethics
(CIS 399)
Spring 2021
All lectures virtual and recorded live
Tuesdays and Thursdays 10:30AM-12PM ET
Instructor:
Prof. Michael Kearns
mkearns@cis.upenn.edu
Office hours (virtual): Tuesdays at noon (right after lecture)
or by appointment.
Teaching Assistants:
Sheng Gao
shenggao@wharton.upenn.edu
Office Hours: Wednesdays 11AM-12PM ET
or by appointment.
Subin Lee
subinlee@wharton.upenn.edu
Office Hours: Tuesdays 3-4PM ET
or by appointment.
Hua Wang
wanghua@wharton.upenn.edu
Office Hours: Fridays 3:30-4:30PM ET
or by appointment.
Course Description
This course is about the social and human problems that can arise from algorithms, AI and machine learning, and how we might design these technologies to be "better behaved" in the first place. It is first and foremost a science or engineering course, since we will be developing algorithm design principles. You can get a broad sense of course themes and topics by visiting the websites for the 2020 and 2019 offerings.
Prerequisites: Familiarity with some machine learning, basic statistics and probability theory will be helpful. While this is not a theory class, you need to be comfortable with mathematical notation and formalism. There will be some simple coding and data analysis assignments, so some basic programming ability is needed.
Course content will include readings from the scientific literature, the mainstream media and other articles and books.
Grades will be based on homeworks and quizzes, as well as class participation and a book club presentation.
Important: All lectures for the course will be recorded live each Tuesday and Thursday from 10:30AM to noon ET. While I realize it's not possible for some of you, I'd like to encourage anyone who is able to attend the live lectures, participate in discussions, and ask questions. This engagement makes the lectures better for everyone: me, the attendees, and those watching the recordings.
Here is the Zoom link for the live sessions. After each lecture, the recordings and lecture notes will be posted to the schedule below.
Date  Topics  Slides, Readings, Assignments, Announcements 

Thu Jan 21 Tue Jan 26 
Course Introduction and Overview 
Lecture videos:
[Jan 21]
[Jan 26]
A general-audience introduction to some of the themes of the course is given in the (recommended but not required) book The Ethical Algorithm: The Science of Socially Aware Algorithm Design, by M. Kearns and A. Roth. Also recommended but not required: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, by C. O'Neil.
Thu Jan 28 Tue Feb 2 Thu Feb 4 
Foundations of Machine Learning 
Lecture videos:
[Jan 28]
[Feb 2]
[Feb 4]
Here is Homework 1, tentatively due Feb 16. There is a good and comprehensive set of videos and readings related to many of the topics we covered in these lectures in this Google machine learning course.
Tue Feb 9 Thu Feb 11 Tue Feb 16 Thu Feb 18 
(Un)Fairness in Machine Learning: COMPAS and ProPublica 
Lecture videos:
[Feb 9]
[Feb 11]
[Feb 16]
[Feb 18]
The following readings are required; you should read the two ProPublica pieces before the Feb 9 lecture so we can discuss them then.
COMPAS Risk Assessment Survey (just skim)
Practitioner's Guide to COMPAS Core (no need to read, but we'll peruse a bit together in lecture)
ProPublica GitHub repository, including dataset (we'll look at the dataset a bit in lecture)
Northpointe response to ProPublica (just skim)
Tue Feb 23 Thu Feb 25 Tue Mar 2 Thu Mar 4 
Fairness in ML: Models and Algorithms 
Lecture videos:
[Feb 23]
[Feb 25]
[Mar 2]
[Mar 4]
Readings:
Inherent Trade-Offs in the Fair Determination of Risk Scores, J. Kleinberg, S. Mullainathan, M. Raghavan. (read first 8 pages)
The Frontiers of Fairness in Machine Learning. Alexandra Chouldechova, Aaron Roth. (read entire article)
Please play around with the following Google demo site on fairness and ML.
Here is the zip file for Homework 2, which is due Monday March 8.
