PENN CIS 625, FALL 2020: THEORY OF MACHINE LEARNING

Prof. Michael Kearns
mkearns@cis.upenn.edu

Time: Tuesdays and Thursdays 10:30AM-Noon
Location: Virtual lectures via Zoom at this URL. All students are strongly encouraged to attend lectures "live" and with video on to increase engagement and interaction. However, all lectures will be recorded and notes provided to accommodate those in other time zones.
Office Hours: Prof. Kearns will hold office hours from noon to 1PM after each Tuesday's lecture at this URL.

URL for this page:
www.cis.upenn.edu/~mkearns/teaching/CIS625

Previous incarnations of this course:
www.cis.upenn.edu/~mkearns/teaching/COLT/colt18.html
www.cis.upenn.edu/~mkearns/teaching/COLT/colt17.html
www.cis.upenn.edu/~mkearns/teaching/COLT/colt16.html
www.cis.upenn.edu/~mkearns/teaching/COLT/colt15.html (with Grigory Yaroslavtsev)
www.cis.upenn.edu/~mkearns/teaching/COLT/colt12.html (with Jake Abernethy)
www.cis.upenn.edu/~mkearns/teaching/COLT/colt08.html (with Koby Crammer)


COURSE DESCRIPTION

This course is an introduction to the theory of machine learning, and provides mathematical, algorithmic, complexity-theoretic, and probabilistic/statistical foundations for modern machine learning and related topics.

The first part of the course will follow portions of An Introduction to Computational Learning Theory, by M. Kearns and U. Vazirani (MIT Press). We will cover perhaps 6 or 7 of the chapters in K&V over (approximately) the first half of the course, often supplementing with additional readings and materials. Copies of K&V will be available at the Penn bookstore; I'm also looking into providing electronic access, as I do for the first chapter in the schedule below. The second portion of the course will focus on a number of models and topics in learning theory and related areas not covered in K&V.

The course will give a broad overview of the kinds of theoretical problems and techniques typically studied and used in machine learning, and provide a basic arsenal of powerful mathematical tools for analyzing machine learning problems.

Topics likely to be covered include:

  • Basics of the Probably Approximately Correct (PAC) Learning Model
  • Occam's Razor, Compression and Learning
  • Concentration, Uniform Convergence and the Vapnik-Chervonenkis Dimension
  • Boosting and Related Techniques
  • Learning in the Presence of Noise and Statistical Query Learning
  • Learning and Cryptography
  • Query Models
  • Online and No-Regret Learning Models
  • Agnostic Learning
  • Game Theory and Machine Learning
  • Learning and Differential Privacy
  • Fairness in Machine Learning
  • Theory of Reinforcement Learning
  • Theory of Deep Learning


COURSE FORMAT, REQUIREMENTS, AND PREREQUISITES

Much of the course will be in fairly traditional "chalk talk" lecture format (virtually on Zoom), but with ample opportunity for discussion, participation, and critique. The course will meet Tuesdays and Thursdays from 10:30AM to noon, with the first meeting on Tuesday, September 1.

The course will involve advanced mathematical material, will cover formal proofs in detail, and will be taught at the doctoral level. While there are no specific formal prerequisites, background or courses in algorithms, complexity theory, discrete math, combinatorics, convex optimization, probability theory, and statistics will prove helpful, as will "mathematical maturity" in general. We will be examining detailed proofs throughout the course. If you have questions about the desired background, please ask. Auditors and occasional participants are welcome.

The course requirements for registered students will be a mixture of active in-class participation, problem sets, possibly leading a class discussion, and a final project. Final projects can range from actual research work to a literature survey to solving some additional problems.


MEETING/TOPIC SCHEDULE
The exact timing and set of topics below will depend on our progress and will be updated as we proceed.

    Tue Sep 1
    Course overview, topics, and mechanics.

  • Lecture notes 1.
  • Lecture video.

    Thu Sep 3
    Model, algorithm and analysis for the rectangle learning problem.

  • Lecture notes 2.
  • Lecture video.
  • K&V Chapter 1.
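
    For a concrete feel for this lecture's algorithm, here is a minimal "tightest-fit" sketch in Python (illustrative only; the function names and data format are my own, not from K&V):

    def tightest_fit(sample):
        """Tightest-fit hypothesis for the axis-aligned rectangle learning problem.
        `sample` is a list of ((x, y), label) pairs with label 1 for positive points."""
        positives = [point for point, label in sample if label == 1]
        if not positives:                 # no positive examples: predict negative everywhere
            return None
        xs = [x for x, _ in positives]
        ys = [y for _, y in positives]
        return (min(xs), max(xs), min(ys), max(ys))

    def classify(rect, point):
        """Label a point with the learned rectangle (1 = inside, 0 = outside)."""
        if rect is None:
            return 0
        x, y = point
        xlo, xhi, ylo, yhi = rect
        return int(xlo <= x <= xhi and ylo <= y <= yhi)

    The analysis in lecture shows why, with a large enough random sample, this hypothesis has small error with high probability.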

    Tue Sep 8
    Introduction to the PAC model.

  • Lecture notes 3. We stopped on page 7.
  • Lecture video. There were tech problems at the start, so skip to about minute 17 for the start of the action.
  • K&V Chapter 1.

    Thu Sep 10
    Learning Boolean conjunctions in the PAC learning model.

  • Lecture notes 3. We stopped on page 15.
  • Lecture video.
  • K&V Chapter 1.
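
    For intuition, here is the elimination algorithm from this lecture in a few lines of Python (a sketch only; the literal encoding is my own choice):

    def eliminate(n, positive_examples):
        """Elimination algorithm for learning a conjunction of literals over n Boolean
        variables. Literal (i, 1) stands for x_i and (i, 0) for NOT x_i. Each positive
        example deletes every literal it falsifies; negative examples are ignored."""
        hypothesis = {(i, b) for i in range(n) for b in (0, 1)}   # start with all 2n literals
        for x in positive_examples:                               # x is a tuple of n bits
            hypothesis = {(i, b) for (i, b) in hypothesis if x[i] == b}
        return hypothesis

    def predict(hypothesis, x):
        """1 if x satisfies every remaining literal, else 0."""
        return int(all(x[i] == b for (i, b) in hypothesis))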

    Tue Sep 15
    Hardness of PAC-learning 3-term DNF.

  • Lecture notes 3. We stopped on page 23.
  • Lecture video.
  • K&V Chapter 1.

    Thu Sep 17
    PAC learnability of 3-term DNF by 3-CNF formulas. Introduction to consistency, compression and learning.

  • Lecture notes 3. We stopped at the end.
  • Lecture notes 4. We stopped on page 7.
  • Lecture video.
  • K&V Chapter 1.
  • K&V Chapter 2.

    Tue Sep 22
    Consistency, compression and learning, continued.

  • Lecture notes 4.
  • Lecture video.
  • K&V Chapter 2.
  • Problem Set 1.

    Thu Sep 24
    Consistency, compression and learning, continued. Discussion of Problem Set 1.

  • Lecture notes 4. We stopped at the end.
  • Lecture video.
  • K&V Chapter 2.
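
    For reference, the quantitative statement at the heart of these lectures is the standard finite-class ("cardinality") bound, stated here for orientation: if a learning algorithm outputs a hypothesis from a finite class H that is consistent with

    \[ m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right) \]

    random examples, then with probability at least 1 - \delta that hypothesis has error at most \epsilon.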

    Tue Sep 29
    Introduction to the VC dimension.

  • Lecture notes 5. We stopped on page 20.
  • Lecture video.
  • K&V Chapter 3.

    Thu Oct 1
    VC dimension continued: Sauer's Lemma.

  • Lecture notes 5. We stopped on page 29.
  • Lecture video.
  • K&V Chapter 3.

    Tue Oct 6
    VC dimension continued: probabilistic analysis, the two-sample trick.

  • Lecture notes 5. We stopped on page 40.
  • Lecture video.
  • K&V Chapter 3.

    Thu Oct 8
    VC dimension continued: sample size lower bounds; structural risk minimization; distribution-dependent improvements.

  • Lecture notes 5. We stopped on page 51.
  • Lecture video.
  • K&V Chapter 3.

    Tue Oct 13
    VC dimension finished (finally): data-dependent improvements; other loss functions. Musings on Problem Set 1.

  • Lecture notes 5. We stopped at the end.
  • Musings on Problem Set 1.
  • Lecture video.
  • K&V Chapter 3.
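
    To summarize this block of lectures with the standard statements (up to constants; the refined distribution- and data-dependent versions from lecture improve on these): a concept class of VC dimension d can be PAC learned by any consistent algorithm from

    \[ m \;=\; O\!\left(\frac{1}{\epsilon}\left(d\log\frac{1}{\epsilon} + \log\frac{1}{\delta}\right)\right) \]

    examples, and \Omega(d/\epsilon) examples are necessary for any algorithm.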

    Thu Oct 22
    More musings on Problem Set 1; introduction to learning with classification noise (CN).

  • Lecture notes 6. We stopped on page 5.
  • Lecture video.
  • K&V Chapter 5.

    Tue Oct 27
    Learning with CN continued; the statistical query (SQ) model.

  • Lecture notes 6. We stopped on page 18.
  • Lecture video.
  • K&V Chapter 5.

    Thu Oct 29
    The SQ model continued: SQ learning implies CN learning; hardness of parity functions in SQ.

  • Lecture notes 6. We stopped on page 38.
  • Lecture video.
  • K&V Chapter 5.
  • Problem Set 2.

    Tue Nov 3
    The SQ model continued: the SQ dimension; hardness of learning decision trees and DNF in the SQ model.

  • Lecture notes 6. We stopped on page 45.
  • Lecture video.
  • K&V Chapter 5.

    Thu Nov 5
    The PAC Solar System: containments and separations; cryptographic hardness; PAC with membership queries. Preview of Problem Set 2.

  • Lecture notes 6. We stopped at the end.
  • Lecture video.

    Tue Nov 10
    Boosting: model and problem formulation; Schapire's ternary majority construction.

  • Lecture notes 7. We stopped on page 15.
  • Lecture video.

    Thu Nov 12
    AdaBoost: algorithm and analysis; connection to weak and strong PAC learning.

  • Lecture notes 7. We stopped at the end.
  • Lecture video.
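
    If you want to see the algorithm in code before (or after) lecture, here is a minimal AdaBoost sketch (illustrative only; the weak-learner interface and variable names are my own):

    import math

    def adaboost(examples, labels, weak_learner, rounds):
        """Minimal AdaBoost sketch. Labels are in {-1, +1}. `weak_learner(examples,
        labels, D)` should return a hypothesis h: example -> {-1, +1} whose weighted
        error under the distribution D is less than 1/2."""
        m = len(examples)
        D = [1.0 / m] * m                                  # uniform initial distribution
        ensemble = []                                      # list of (alpha, h) pairs
        for _ in range(rounds):
            h = weak_learner(examples, labels, D)
            err = sum(D[i] for i in range(m) if h(examples[i]) != labels[i])
            err = min(max(err, 1e-12), 1 - 1e-12)          # guard against degenerate errors
            alpha = 0.5 * math.log((1 - err) / err)
            # up-weight misclassified points, down-weight correct ones, then renormalize
            D = [D[i] * math.exp(-alpha * labels[i] * h(examples[i])) for i in range(m)]
            Z = sum(D)
            D = [d / Z for d in D]
            ensemble.append((alpha, h))
        return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1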

    Tue Nov 17
    Introduction to no-regret learning; Multiplicative Weights algorithm and analysis.

  • Lecture notes 8. We stopped on page 12.
  • Lecture video.
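
    A minimal sketch of the Multiplicative Weights (Hedge) update, for intuition only (the interface is my own; losses are assumed to lie in [0, 1]):

    import math

    def multiplicative_weights(losses, eta):
        """Run the Hedge/MW update over T rounds. `losses` is a list of T lists, each
        giving the loss in [0, 1] of each of the N experts that round; returns the
        sequence of probability distributions the algorithm plays."""
        N = len(losses[0])
        w = [1.0] * N                                       # one weight per expert
        plays = []
        for loss in losses:
            total = sum(w)
            plays.append([wi / total for wi in w])          # play the normalized weights
            # exponentially down-weight each expert in proportion to its loss this round
            w = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]
        return plays

    With the learning rate eta set to roughly sqrt(ln(N)/T), the standard analysis bounds the regret against the best single expert by O(sqrt(T ln N)).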

    Thu Nov 19
    MW analysis continued; discussion and connection to PAC; lower bounds on regret.

  • Lecture notes 8. We stopped on page 22.
  • Lecture video.
  • An excellent survey article on MW by Arora, Hazan and Kale.