**CIS 700/04
ADVANCED TOPICS IN MACHINE LEARNING
FALL 2003**

URL for this page:
**http://www.cis.upenn.edu/~mkearns/teaching/cis700**

**INSTRUCTOR**

**Prof. Michael Kearns**

**mkearns@cis.upenn.edu**

**Office hours: By appointment**

**COURSE LOCATION AND TIME**

Tuesday and Thursday 12-1:30, in **307 Levine Hall. PLEASE NOTE CHANGE IN ROOM.**

**COURSE DESCRIPTION**

This seminar course will examine selected recent developments in
machine learning research from the literature. While more detail will
be provided as the term develops, potential topics include support vector
machines and kernel methods, clustering, dimensionality reduction, ranking,
online learning, and multiclass learning (see the schedule below).

I am also open to suggestions for both topics and specific readings.

**COURSE FORMAT AND REQUIREMENTS**

The course will be run in seminar/reading group format. Each week we will
examine a couple of papers, and each session will have one or more participants
who will lead an informal discussion of the chosen material (not give a formal
presentation). I will probably also use the course as a venue for external speakers.

Class participation will determine approximately half of the course grade, with the other half based on a course project.

Auditors are welcome, as are sporadic and one-time attendees who drop in for subjects that interest them.

**PREREQUISITES**

Familiarity with the basics of machine learning, statistics and
analysis of algorithms.

**SCHEDULE AND READINGS**

**Th Sep 4:** We'll start with a mainly organizational meeting
to discuss possible topics for the term, and to start plotting
the schedule and lining up volunteers.

**Tu Sep 9, Th Sep 11: Support Vector Machines and Kernel Methods**

I will start things off by leading discussions of some portions of the
papers below. There is likely to be a lot of overlap among the first three items,
which are all tutorials, so I'd suggest that, at a minimum, everyone take
a good look at one of them.
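
For those meeting kernel methods for the first time, a tiny concrete example may help before the readings: the sketch below implements a kernel perceptron (a simpler relative of the SVM, not an algorithm from the assigned papers) with a Gaussian kernel. All names and parameter choices here are mine and purely illustrative.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: k(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def train_kernel_perceptron(X, y, kernel=rbf_kernel, epochs=50):
    """Kernel perceptron: keep one dual coefficient alpha_i per training point.

    The decision function is f(x) = sum_i alpha_i * y_i * k(x_i, x);
    a mistake on example i increments alpha_i.
    """
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            f = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n))
            if y[i] * f <= 0:            # mistake: update the dual coefficient
                alpha[i] += 1.0
    return alpha

def predict(X_train, y_train, alpha, x, kernel=rbf_kernel):
    f = sum(alpha[j] * y_train[j] * kernel(X_train[j], x) for j in range(len(X_train)))
    return 1 if f >= 0 else -1

# Toy usage: XOR-like data that no linear separator can handle.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, 1, 1, -1])
alpha = train_kernel_perceptron(X, y)
print([predict(X, y, alpha, x) for x in X])   # recovers [-1, 1, 1, -1]
```

The point to notice is that the learner never constructs feature-space coordinates explicitly; it only evaluates the kernel, which is exactly the trick the SVM and kernel-methods literature exploits.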

**Tu Sep 16, Th Sep 18: SVMs Continued**

We'll finish up our study of SVMs with a couple of more application-oriented
papers. Fernando Pereira has graciously offered to lead the discussion of the
second paper below.

**Tu Sep 23, Th Sep 25: Clustering, and a Special Seminar**

Tu Sep 23: Jon Kleinberg has a really thought-provoking
impossibility result that we will examine, led by **Nick Montfort**.
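
For orientation before the discussion, here is the shape of the result as I recall it (a paraphrase; consult the paper for the precise definitions):

```latex
% Kleinberg's framework, paraphrased from memory. A clustering function
% $f$ maps each distance function $d$ on a fixed point set $S$ to a
% partition $f(d)$ of $S$. Three natural axioms:
\textbf{Scale-invariance:} $f(\alpha \cdot d) = f(d)$ for all $\alpha > 0$.\\
\textbf{Richness:} every partition of $S$ equals $f(d)$ for some $d$.\\
\textbf{Consistency:} if $d'$ shrinks distances within the clusters of
$f(d)$ and expands distances between them, then $f(d') = f(d)$.\\
\textbf{Theorem (impossibility):} for $|S| \ge 2$, no clustering
function satisfies all three properties at once.
```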

For Th Sep 25, we'll have the following special seminar by
**Prof Yishay Mansour**, a well-known researcher in computational learning theory,
computational game theory, and distributed computation. This talk mixes all of
these topics, and can be viewed as a distributed learning result in a
game-theoretic setting.

Title: Convergence Time to Nash Equilibria for Load Balancing

Speaker: Yishay Mansour

Affiliation: Tel-Aviv University

Abstract: We study the number of steps required to reach a pure Nash Equilibrium in a load balancing scenario where each job behaves selfishly and attempts to migrate to a machine which will minimize its cost.

We consider a variety of load balancing models, including identical, restricted, related, and unrelated machines. Our results have a crucial dependence on the allowable weights assigned to jobs. We consider arbitrary weights, integer weights, K distinct weights, and identical (unit) weights. We look both at an arbitrary schedule (where the only restriction is that a job migrates to a machine which lowers its cost) and at specific efficient schedulers (such as allowing the largest-weight job to move first).

This is a joint work with Eyal Even-Dar and Alex Kesselman.
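
To get a feel for the dynamics the talk analyzes, here is a toy simulation of selfish migration on identical machines. The random scheduler, the tie-breaking, and all names below are my own illustrative choices, not the model of Even-Dar, Kesselman, and Mansour.

```python
import random

def best_response_dynamics(weights, num_machines, max_steps=10_000, seed=0):
    """Simulate selfish jobs migrating between identical machines.

    A job's cost is the total load of its machine; at each step one job
    (chosen at random among those that can improve) moves to a machine
    that strictly lowers its cost.  Stops at a pure Nash equilibrium.
    Returns (assignment, number_of_migrations).
    """
    rng = random.Random(seed)
    assign = [rng.randrange(num_machines) for _ in weights]
    loads = [0.0] * num_machines
    for j, m in enumerate(assign):
        loads[m] += weights[j]

    for step in range(max_steps):
        movers = []
        for j, m in enumerate(assign):
            # Cost job j would see on machine m2 after moving there.
            best = min(range(num_machines),
                       key=lambda m2: loads[m2] + (0 if m2 == m else weights[j]))
            if best != m and loads[best] + weights[j] < loads[m]:
                movers.append((j, best))
        if not movers:
            return assign, step          # pure Nash equilibrium reached
        j, target = rng.choice(movers)   # an arbitrary (random) scheduler
        loads[assign[j]] -= weights[j]
        loads[target] += weights[j]
        assign[j] = target
    raise RuntimeError("did not converge within max_steps")

jobs = [3.0, 3.0, 2.0, 2.0, 1.0, 1.0]
assignment, migrations = best_response_dynamics(jobs, num_machines=3)
print(assignment, migrations)
```

On identical machines a potential-function argument guarantees this process terminates; the talk's subject is how *fast*, across the richer machine and weight models listed in the abstract.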

**Tu Sep 30, Th Oct 2, Tu Oct 7: Clustering Continued**

Sanjoy Dasgupta has contributed a number of fundamental insights to
clustering over the past few years. Sham Kakade has also suggested we look
at the related Vempala/Wang paper below.
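
As a concrete warm-up for the spectral point of view, the sketch below projects data onto its top singular directions before clustering, which is the basic move behind spectral treatments of Gaussian mixtures. The plain k-means step and every parameter here are illustrative stand-ins, not the papers' actual procedures.

```python
import numpy as np

def spectral_project(X, k):
    """Project data onto the span of the top-k right singular vectors.

    This shrinks within-component noise while roughly preserving the
    distances between component means, making the mixture easier to cluster.
    """
    Xc = X - X.mean(axis=0)                    # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                       # n x k projected coordinates

def lloyd_kmeans(Z, k, iters=50, seed=0):
    """Plain Lloyd's k-means on the projected data (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Toy mixture: two spherical Gaussians in 50 dimensions.
rng = np.random.default_rng(1)
mu = np.zeros(50); mu[0] = 6.0
X = np.vstack([rng.normal(0, 1, (100, 50)), rng.normal(0, 1, (100, 50)) + mu])
labels = lloyd_kmeans(spectral_project(X, k=2), k=2)
print(labels[:5], labels[-5:])                 # the two halves get opposite labels
```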

**Th Oct 9: Special Talk --- Prof Lawrence Saul**

**Prof Lawrence Saul** of CIS will give a special talk on very recent work
directly related to our clustering investigation. His abstract is below.
I also strongly encourage people to attend
**Sebastian Seung's talk at IRCS, Fri Oct 10 at noon.**
He is a terrific speaker.

TITLE: NONLINEAR DIMENSIONALITY REDUCTION BY SEMIDEFINITE PROGRAMMING

ABSTRACT: Can we detect low dimensional structure in high dimensional data sets? The problem of dimensionality reduction is fundamental to machine learning, pattern recognition, and statistics. In this talk, I will describe a new solution to this problem based on semidefinite programming. The resulting algorithm can be used to analyze high dimensional data that lies on or near a low dimensional manifold. The algorithm overcomes various limitations of previous work in manifold learning (e.g., Isomap, LLE, graph Laplacians). It also bridges two recent developments in machine learning: semidefinite programming for learning kernel matrices and spectral methods for nonlinear dimensionality reduction.

Joint work with Kilian Weinberger (U Penn).
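
Based only on the abstract above, here is a rough sketch of the semidefinite-programming recipe it describes, written with the modern cvxpy library (which did not exist in 2003). The constraint set, neighborhood size, and all names are my guesses at a minimal version, not the authors' implementation.

```python
import numpy as np
import cvxpy as cp

def sdp_embed(X, n_neighbors=2, dim=2):
    """Sketch of a 'maximum variance unfolding' style embedding.

    Learn a centered PSD kernel matrix K that maximizes total variance
    trace(K) while preserving squared distances between nearest
    neighbors, then read an embedding off the top eigenvectors of K.
    """
    n = len(X)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # squared distances
    K = cp.Variable((n, n), PSD=True)
    constraints = [cp.sum(K) == 0]          # centering: embedded points have zero mean
    for i in range(n):
        for j in np.argsort(D2[i])[1:n_neighbors + 1]:        # k nearest neighbors
            # ||y_i - y_j||^2 = K_ii + K_jj - 2 K_ij must match the input distance.
            constraints.append(K[i, i] + K[j, j] - 2 * K[i, j] == D2[i, j])
    cp.Problem(cp.Maximize(cp.trace(K)), constraints).solve()
    # Spectral decomposition of the learned kernel gives the coordinates.
    w, V = np.linalg.eigh(K.value)
    idx = np.argsort(w)[::-1][:dim]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Toy usage: points on a curled 1-d curve in 3-d should roughly flatten out.
t = np.linspace(0, 2 * np.pi, 20)
X = np.c_[np.cos(t), np.sin(t), 0.1 * t]
print(sdp_embed(X).shape)
```

Note how this literally bridges the two developments the abstract names: the SDP learns a kernel matrix, and a spectral method turns that kernel into an embedding.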

**Th Oct 16, Tu Oct 21: Clustering Data Streams**

**Prof Sudipto Guha** will lead us through recent algorithms for on-line clustering of
data streams. More references to follow.
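
Ahead of the references, here is one simplified rendering of the divide-and-conquer idea that recurs in this literature: cluster each chunk of the stream, keep only weighted centers, and cluster those at the end. The use of k-means and every parameter below are my own illustrative simplifications.

```python
import numpy as np

def weighted_kmeans(points, weights, k, iters=25, seed=0):
    """Tiny weighted Lloyd's k-means (helper; illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = ((points[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            mask = labels == c
            if mask.any():
                centers[c] = np.average(points[mask], axis=0, weights=weights[mask])
    return centers, labels

def stream_cluster(stream, k, chunk_size=200):
    """One-pass divide-and-conquer clustering in the spirit of the
    streaming k-median literature: cluster each chunk to k weighted
    centers, retain only those centers, then cluster the retained
    centers at the end.  Memory is O(k * number_of_chunks), not O(n).
    """
    summary_pts, summary_wts = [], []
    for start in range(0, len(stream), chunk_size):
        chunk = stream[start:start + chunk_size]
        centers, labels = weighted_kmeans(chunk, np.ones(len(chunk)), k)
        for c in range(k):
            w = (labels == c).sum()
            if w > 0:
                summary_pts.append(centers[c])
                summary_wts.append(float(w))   # weight each center by its cluster size
    pts, wts = np.array(summary_pts), np.array(summary_wts)
    final_centers, _ = weighted_kmeans(pts, wts, k)
    return final_centers

rng = np.random.default_rng(2)
stream = np.vstack([rng.normal(m, 0.3, (500, 2)) for m in ((0, 0), (4, 4), (0, 4))])
print(stream_cluster(stream, k=3))             # roughly recovers the three means
```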

**Th Oct 23, Tu Oct 28, Th Oct 30: Ranking**

**Libin Shen** will lead us through some of the many recent interesting papers on the problem
of learning ordinal rankings of alternatives.
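
One recurring reduction in this area turns ranking into binary classification of pairs: learn a scoring function so that preferred items score higher. The sketch below does this with a perceptron on difference vectors; it is a generic illustration of the reduction, not any specific paper's algorithm, and all names are mine.

```python
import numpy as np

def train_pairwise_ranker(pairs, dim, epochs=100):
    """Perceptron on difference vectors: for each preference (a over b),
    require w.(a - b) > 0; on a violation, update w += (a - b).
    Ranking thus reduces to binary classification of pairs.
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        mistakes = 0
        for preferred, other in pairs:
            if w @ (preferred - other) <= 0:
                w += preferred - other
                mistakes += 1
        if mistakes == 0:                 # all preferences satisfied
            break
    return w

# Toy usage: items scored by a hidden linear utility; pairs record preferences.
rng = np.random.default_rng(3)
items = rng.normal(size=(30, 5))
hidden = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
pairs = [(items[i], items[j]) if items[i] @ hidden > items[j] @ hidden
         else (items[j], items[i])
         for i in range(30) for j in range(i + 1, 30)]
w = train_pairwise_ranker(pairs, dim=5)
ranking = np.argsort(-(items @ w))        # sort items by learned score
print(ranking[:5])
```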

**Tu Nov 4, Th Nov 6: Dimensionality Reduction Revisited**

**Kilian Weinberger** will take us through a set of papers on dimensionality
reduction that are closely related to his ongoing work with Lawrence Saul
on the topic.

**Tu Nov 11, Th Nov 13: Boosting, Online Learning, and Game Theory**

Kobby Essien will lead us through a set of related papers on online prediction and game-theoretic
learning frameworks.
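
A common thread in these papers is multiplicative-weight updating, so a minimal sketch of prediction with expert advice may be useful context. The learning rate and the toy data below are arbitrary choices of mine.

```python
import numpy as np

def exponential_weights(loss_matrix, eta=0.1):
    """Hedge-style prediction with expert advice.

    loss_matrix[t, i] is expert i's loss (in [0, 1]) at round t.  The
    learner plays a distribution over experts and multiplicatively
    down-weights experts that incur loss.  This style of update
    underlies the connections among boosting, online learning, and
    (via repeated play) minimax equilibria in zero-sum games.
    """
    T, n = loss_matrix.shape
    w = np.ones(n)
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()                        # learner's distribution over experts
        total_loss += p @ loss_matrix[t]       # expected loss this round
        w *= np.exp(-eta * loss_matrix[t])     # penalize lossy experts
    best_expert = loss_matrix.sum(axis=0).min()
    return total_loss, best_expert

rng = np.random.default_rng(4)
losses = rng.uniform(size=(1000, 10))
losses[:, 3] *= 0.5                            # expert 3 is reliably better
ours, best = exponential_weights(losses)
print(f"learner {ours:.1f} vs best expert {best:.1f}")   # regret grows only sublinearly
```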

**Nov 20 and Dec 2: Approaches to Multiclass Learning**

**Yuan Ding** will lead our study of algorithms and models for
dealing with multiclass learning problems, many of which have a coding-theoretic element.
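
The coding-theoretic element typically means error-correcting output codes: encode each class as a codeword, train one binary problem per bit, and decode by nearest codeword. The sketch below uses a random code and a least-squares binary learner purely for illustration; every name and parameter is my own choice.

```python
import numpy as np

def make_code_matrix(n_classes, n_bits, seed=5):
    """Random +/-1 coding matrix, one codeword (row) per class; random
    codes are a simple stand-in for carefully designed codes."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1, 1], size=(n_classes, n_bits))

def train_ecoc(X, y, code, train_binary):
    """Train one binary classifier per column: column b relabels class c
    as code[c, b]."""
    return [train_binary(X, code[y, b]) for b in range(code.shape[1])]

def predict_ecoc(X, classifiers, code):
    """Predict the class whose codeword best agrees with the vector of
    binary predictions (nearest codeword decoding)."""
    bits = np.stack([clf(X) for clf in classifiers], axis=1)   # n x n_bits of +/-1
    scores = bits @ code.T                                     # agreement with each codeword
    return scores.argmax(axis=1)

def least_squares_learner(X, z):
    """Binary learner: least squares on [x1, x2, x1*x2, 1], then sign;
    the product feature lets this 'linear' rule handle XOR-like splits."""
    def phi(A):
        return np.c_[A, A[:, :1] * A[:, 1:2], np.ones(len(A))]
    w = np.linalg.lstsq(phi(X), z.astype(float), rcond=None)[0]
    return lambda Xq: np.sign(phi(Xq) @ w)

# Toy usage: four well-separated classes in the plane.
rng = np.random.default_rng(6)
means = np.array([[0, 0], [5, 0], [0, 5], [5, 5]])
X = np.vstack([rng.normal(m, 0.5, (50, 2)) for m in means])
y = np.repeat(np.arange(4), 50)
code = make_code_matrix(n_classes=4, n_bits=8)
clfs = train_ecoc(X, y, code, least_squares_learner)
print((predict_ecoc(X, clfs, code) == y).mean())   # training accuracy
```

The error-correcting flavor comes from decoding: even if a few binary classifiers err on an example, the nearest codeword is usually still the right class.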

**Dec 4: SPECIAL TALK**

Title: Dimension reduction and information theory - what's the connection?

Speaker: Naftali Tishby, The Hebrew University

Abstract:

In this talk I will take a new look at Shannon's information theory from a machine learning perspective. I will argue that Shannon's theory provides a compelling mechanism for quantifying the fundamental tradeoff between complexity and accuracy, by unifying the source and channel coding theorems into one principle which I call the "Information Bottleneck" (IB). This unified view of the coding theorems can shed new light on the question of "relevant data representation" and suggests new algorithms for extracting such representations from co-occurrence statistics. In the second part I will briefly examine two different applications of this principle to continuous dimension reduction and feature extraction from high dimensional datasets: Sufficient Dimensionality Reduction (SDR) and Gaussian-IB, both of which provide nonlinear continuous dimension reduction that preserves relevant information.

More information and other related papers can be found at http://www.cs.huji.ac.il/labs/learning/Papers/IBM_list.html
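
For reference, the bottleneck tradeoff the abstract alludes to is usually written as the following variational problem (my paraphrase of the standard formulation):

```latex
% The information bottleneck Lagrangian: compress $X$ into a
% representation $T$ while keeping $T$ informative about the
% relevant variable $Y$.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
% $I(X;T)$ is the complexity (rate) of the representation, $I(T;Y)$ is
% the relevant information preserved, and the multiplier $\beta > 0$
% sets the tradeoff between the two.
```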

**Possible Future Material**