CIS 620 Spring 2005
Machine-Learning Models and Algorithms for Structured Data
Fernando Pereira
Many important problems in machine learning involve implicitly or explicitly
structured data:
- Classify hyperlinked Web pages
- Find genes in genetic sequence data
- Parse natural language text
- Model the probability of speech transcriptions
- Segment images
- Find the coreferences in natural-language text
Important: presentation schedule
This edition of CIS 620 will be a lecture course on learning techniques
for structured data. We will start with a review of fundamental prerequisites:
- Introduction 1/11 PDF
- Linear binary classification and large margin 1/13-1/18 lecture slides;
homework 1; homework data
- A variety of additional tutorial material on support vector
machines and kernel methods is listed here.
The following are recommended
They are all somewhat denser mathematically than the lectures, and they
cover additional material, but you should at least skim them to reinforce
the lecture material.
- Logistic regression and maximum-entropy models 1/18-1/20 lecture
slides
- Undirected graphical models
- A teeny bit of convex optimization
We'll then proceed to the main topics:
Coursework will include:
- A detailed topic presentation in class, including preparation
of slides/handouts with main concepts, equations, algorithms, and theorems.
- Several homework assignments that test concepts or ask you to reproduce
standard algorithms or results from the literature on modestly sized datasets.