PENN CIS 625, SPRING 2016: COMPUTATIONAL LEARNING THEORY

Prof. Michael Kearns
mkearns@cis.upenn.edu

Time: Mondays 12-3 PM
Location: 307 Levine

IMPORTANT NOTE: The first course meeting will be on Wednesday, Jan 13, from 12-3 PM, and will be held in the Greenberg Lounge of Skirkanich Hall.

URL for this page:
www.cis.upenn.edu/~mkearns/teaching/COLT

Previous incarnations of this course:
www.cis.upenn.edu/~mkearns/teaching/COLT/colt15.html (with Grigory Yaroslavtsev)
www.cis.upenn.edu/~mkearns/teaching/COLT/colt12.html (with Jake Abernethy)
www.cis.upenn.edu/~mkearns/teaching/COLT/colt08.html (with Koby Crammer)


COURSE DESCRIPTION

This course is an introduction to Computational Learning Theory, a field that aims to provide algorithmic, complexity-theoretic, and probabilistic foundations for modern machine learning and related topics.

The first part of the course will closely follow portions of An Introduction to Computational Learning Theory, by M. Kearns and U. Vazirani (MIT Press). We will cover perhaps 6 or 7 of the chapters in K&V over (approximately) the first half of the course, often supplementing with additional readings and materials. Copies of K&V will be available at the Penn bookstore. The second portion of the course will focus on a number of models and topics in learning theory and related areas not covered in K&V.

The course will give a broad overview of the kinds of problems and techniques typically studied in Computational Learning Theory, and provide a basic arsenal of powerful mathematical tools for analyzing machine learning problems.

Topics likely to be covered include:

  • Basics of the Probably Approximately Correct (PAC) Learning Model
  • Occam's Razor, Compression and Learning
  • Uniform Convergence and the Vapnik-Chervonenkis Dimension
  • Learning in the Presence of Noise and Statistical Query Learning
  • Learning and Cryptography
  • Query Models
  • Boosting
  • Online and No-Regret Learning
  • Discrete Fourier Methods in Learning
  • Agnostic Learning
  • Property Testing and Learning

Additional topics we may cover include:

  • Active Learning
  • Learning Theory and Algorithmic Mechanism Design
  • Learning Theory and Differential Privacy
  • Learning Theory and Computational Models of Evolution
  • PAC-style Analyses in Reinforcement Learning
  • PAC-style Analyses in Probabilistic Inference


COURSE FORMAT, REQUIREMENTS, AND PREREQUISITES

Much of the course will be in fairly traditional "chalk talk" lecture format, but with ample opportunity for discussion, participation, and critique. The course will meet once a week on Mondays from 12 to 3 (but see note above about the first meeting). Lunch will be served.

The course will be taught at the doctoral level and will involve advanced mathematical material, with detailed formal proofs examined throughout. While there are no specific formal prerequisites, background or coursework in algorithms, complexity theory, discrete math, combinatorics, probability theory, and statistics will prove helpful, as will "mathematical maturity" in general. If you have questions about the desired background, please ask. Auditors and occasional participants are welcome.

The course requirements for registered students will be a to-be-determined mixture of active in-class participation, problem sets, possibly leading a class discussion, and a final project. Final projects can range from original research to a literature survey to solving additional problems.


DETAILED MEETING/TOPIC SCHEDULE
The exact timing and set of topics below will depend on our progress and will be updated as we proceed.

Wed Jan 13
In the first meeting, we will go over course mechanics and present a course overview, then immediately begin investigating the Probably Approximately Correct (PAC) model of learning. Detailed topics covered: learning rectangles in the real plane; definition of the PAC model; PAC-learnability of rectangles in d dimensions.

READING: K&V Chapter 1.
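
As a concrete companion to the Chapter 1 reading, here is a minimal Python sketch of the tightest-fit algorithm for learning axis-aligned rectangles: output the smallest rectangle consistent with the positive examples seen. The function names are illustrative, not code from the course.

    # Tightest-fit hypothesis for axis-aligned rectangles in the plane.
    # K&V Chapter 1 shows that O((1/epsilon) * log(1/delta)) random
    # examples suffice for this hypothesis to have error at most epsilon
    # with probability at least 1 - delta.

    def tightest_fit(examples):
        """examples: list of ((x, y), label) pairs, label True if positive."""
        positives = [point for point, label in examples if label]
        if not positives:
            return None  # with no positives, predict negative everywhere
        xs = [x for x, _ in positives]
        ys = [y for _, y in positives]
        return (min(xs), max(xs), min(ys), max(ys))

    def predict(rect, point):
        if rect is None:
            return False
        x_lo, x_hi, y_lo, y_hi = rect
        x, y = point
        return x_lo <= x <= x_hi and y_lo <= y <= y_hi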

Mon Jan 18
No lecture, MLK holiday.

Mon Jan 25
No lecture, Prof Kearns out of town.

Mon Feb 1
PAC-learnability of conjunctions of boolean variables. Intractability of PAC-learning 3-term DNF; learning 3-term DNF by 3CNF and the importance of hypothesis representation.
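
The conjunctions result also has a short algorithmic core. Below is a Python sketch (names are mine, not from K&V) of the standard elimination algorithm: begin with the conjunction of all 2n literals and delete any literal falsified by a positive example; negative examples are never used.

    # Elimination algorithm for PAC-learning boolean conjunctions.
    # The hypothesis always retains every literal of the target, so it
    # can only err by labeling a positive example negative.

    def learn_conjunction(examples, n):
        """examples: list of (bits, label), bits a length-n tuple of bools."""
        # literal (i, True) means x_i; literal (i, False) means NOT x_i
        literals = {(i, v) for i in range(n) for v in (True, False)}
        for bits, label in examples:
            if label:  # a positive example falsifies literal (i, not bits[i])
                literals -= {(i, not bits[i]) for i in range(n)}
        return literals

    def predict(literals, bits):
        return all(bits[i] == v for (i, v) in literals)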

Mon Feb 8
Sample complexity and the Vapnik-Chervonenkis dimension.
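
For orientation, the central quantitative result here (the classic upper bound of Blumer, Ehrenfeucht, Haussler, and Warmuth, as presented in K&V Chapter 3) bounds the number of examples needed in terms of the VC dimension d of the hypothesis class: a sample of size

    m = O\left( \frac{1}{\epsilon} \left( d \log\frac{1}{\epsilon} + \log\frac{1}{\delta} \right) \right)

suffices for any consistent hypothesis to have error at most epsilon with probability at least 1 - delta. The lecture will develop the precise constants and related lower bounds.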

Mon Feb 15
Guest lecture by Jamie Morgenstern; sample complexity of auctions.

Mon Feb 22
Representation-independent intractable ("truly hard") learning problems via cryptography.

Mon Feb 29
Learning and cryptography continued; introduction to boosting.

Mon Mar 7
Spring Break, no lecture.

Mon Mar 14
Boosting continued: AdaBoost and top-down decision tree learning.
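
To make the AdaBoost discussion concrete, here is a compact Python sketch of its reweighting scheme, assuming numpy and taking decision stumps as the weak learners; the helper names are illustrative, and this is a sketch rather than the lecture's code.

    import numpy as np

    def best_stump(X, y, w):
        """Exhaustive search for the decision stump (feature j, threshold t,
        sign s) minimizing weighted training error under distribution w."""
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = np.where(X[:, j] <= t, s, -s)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        return best

    def adaboost(X, y, rounds=20):
        """X: (m, n) array of features; y: length-m array of +/-1 labels."""
        m = len(y)
        w = np.full(m, 1.0 / m)        # uniform initial distribution
        ensemble = []
        for _ in range(rounds):
            err, j, t, s = best_stump(X, y, w)
            err = max(err, 1e-12)      # guard against a perfect stump
            alpha = 0.5 * np.log((1.0 - err) / err)
            pred = np.where(X[:, j] <= t, s, -s)
            w = w * np.exp(-alpha * y * pred)  # up-weight the mistakes
            w = w / w.sum()
            ensemble.append((alpha, j, t, s))
        return ensemble

    def predict(ensemble, x):
        """Sign of the weighted vote of the stumps."""
        score = sum(alpha * (s if x[j] <= t else -s)
                    for (alpha, j, t, s) in ensemble)
        return 1 if score >= 0 else -1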

Problem Set --- Due in hardcopy format in class, Monday April 11: K&V: 1.1, 1.3, 3.1, 3.2, 4.3, 5.2, 5.6, 8.1, 8.3. You are free to collaborate on the problem set, but please turn in separate write-ups and acknowledge your collaborations. Please do not attempt to look up solutions in the literature.

Mon Mar 21
Learning with noise and the statistical query model.
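
As a preview of the statistical query (SQ) model, here is a small Python sketch, with illustrative function names, of how an SQ oracle can be simulated from random examples: by Hoeffding's inequality, O(log(1/delta) / tau^2) samples estimate any {0,1}-valued query to within tolerance tau.

    import math

    def simulate_sq(chi, target, draw, tau, delta):
        """Simulate the SQ oracle STAT(c, D): estimate E[chi(x, c(x))]
        to within additive tolerance tau, with probability >= 1 - delta.
        chi maps (example, label) to {0, 1}; draw() samples x from D.
        By Hoeffding, m = ln(2/delta) / (2 * tau^2) samples suffice."""
        m = math.ceil(math.log(2.0 / delta) / (2.0 * tau ** 2))
        total = 0
        for _ in range(m):
            x = draw()
            total += chi(x, target(x))
        return total / m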

Mon Mar 28
Guest lecture by Steven Wu; statistical query learning and differential privacy.
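
One direction of this connection is easy to sketch: a statistical query on a fixed dataset can be answered with differential privacy via the Laplace mechanism, since the empirical average of a {0,1}-valued query has sensitivity 1/n. A minimal sketch, assuming numpy, with illustrative names:

    import numpy as np

    def private_sq(chi, data, epsilon):
        """Answer one statistical query on a dataset of n labeled examples
        with epsilon-differential privacy. Changing one example moves the
        empirical average by at most 1/n, so Laplace noise of scale
        1/(n * epsilon) suffices."""
        n = len(data)
        avg = sum(chi(x, y) for (x, y) in data) / n
        return avg + np.random.laplace(loc=0.0, scale=1.0 / (n * epsilon))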

Mon Apr 4
Learning with membership and equivalence queries: deterministic finite automata.
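
Ahead of this lecture, the Python sketch below is a toy illustration of the two query types (not Angluin's full L* algorithm): a teacher for a hidden DFA that answers membership queries exactly and approximates equivalence queries by brute-force search for a short counterexample. All names are mine.

    from itertools import product

    class HiddenDFA:
        """Teacher for a target DFA over the alphabet {'0', '1'}."""

        def __init__(self, transitions, accepting, start=0):
            self.trans, self.accepting, self.start = transitions, accepting, start

        def member(self, word):
            """Membership query: does the target accept this string?"""
            state = self.start
            for ch in word:
                state = self.trans[state][ch]
            return state in self.accepting

        def equivalent(self, hypothesis, max_len=8):
            """Equivalence query, approximated by brute force: return a
            counterexample of length <= max_len, or None if none exists."""
            for k in range(max_len + 1):
                for letters in product('01', repeat=k):
                    w = ''.join(letters)
                    if hypothesis(w) != self.member(w):
                        return w
            return None

    # Example target: strings with an even number of 1s.
    dfa = HiddenDFA({0: {'0': 0, '1': 1}, 1: {'0': 1, '1': 0}}, accepting={0})
    assert dfa.member('11') and not dfa.member('1101')
    assert dfa.equivalent(lambda w: w.count('1') % 2 == 0) is None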

Mon Apr 11
TBD.

Mon Apr 18
TBD.