Do you trust your model? Despite their widespread adoption and impressive performance, modern machine learning models have a crucial flaw: it is extremely difficult to discern when and how they fail. This difficulty has given rise to trustworthy machine learning, a field of research that aims to make these systems safe, responsible, and understandable.
This course will explore tools and methods for analyzing the machine learning pipeline and assessing its trustworthiness (or lack thereof) from the perspectives of datasets, models, and predictions. A tentative schedule of these topics can be found at the bottom of this page.
Class: Tues 1:45-3:15pm Eastern, DRLB 4C6 / Thurs 1:45-3:15pm Eastern, CHEM 514
Ed discussion: Self sign-up link
Mask policy: Masks are required.
Students from all majors and degree levels are welcome. There are no specific prerequisite courses, but a background in machine learning at the level of an introductory course is expected, as well as basic programming experience for the course project.
Grading will be based on the course project (80%: 15% proposal + 20% progress report + 25% final report + 20% presentation) and participation (20%: 5% readings + 15% discussion). There will be no homework or exams.
This class will combine lectures and discussions. The lectures will typically cover the core groundwork, followed by a student-led in-depth discussion based on assigned readings. Readings and lecture materials will be posted on the schedule.
As part of this course, students will inspect and debug machine learning problems for deficiencies in settings of their choice. All parts of the pipeline are fair game, including data collection, training algorithms, models and architectures, the resulting predictions, and even the debugging tools themselves. This can take the form of an audit (identifying the shortcomings of a fixed pipeline) or a patch/update (changing the pipeline to fix a problem). Example projects at various stages in the pipeline include the following:
- Datasets:
  - Are there biases, spurious correlations, or underrepresented subpopulations? For example, does US census data have any blind spots or misleading correlations?
  - Where do these problems stem from, and how do they impact downstream predictions?
  - Can we fix the data or collection procedure to mitigate these issues?
- Methods and architectures:
  - Do ML algorithms for fixing models via training (e.g. for fairness, privacy, adversarial robustness, or security) actually achieve their goals?
  - Can you pinpoint or characterize the failures of modern architectures (such as large language models)?
  - Can you construct counterexamples or subpopulations that exemplify the failure modes of these models and algorithms, or guarantee that such failure modes don’t exist?
- Interpretability and predictions:
  - How faithful are explainability methods to the actual model predictions?
  - Are the types of explanations we can generate aligned with what practitioners need?
    - For example, do analysis tools for diagnosing health conditions give doctors useful and meaningful information?
Tentative schedule and topics
The schedule and topics can change based on students’ interests and as time permits. If you don’t see something you’d like to learn about, send me an email.
There is no official textbook for this course, but you may find the following references to be useful: