CIS 630 -- Machine Learning for Language Processing
The main goal of this course is to
get us all working on interesting research problems involving
machine learning in language processing. The choice of topics
reflects what I work on and know about and what I believe is important
to start good work in this area. We will do this by combining
presentation of applications, concepts and techniques with projects
that develop our ability to do publishable research in this area.
The syllabus below may be a superset of what we will actually be
able to cover. I will keep adjusting it depending on what other
questions come up in class.
Prerequisites
Familiarity with basic notions of probability and of machine
learning, as provided by CIS520 or equivalent. If you aren't
sure, ask me. But be warned that this is a graduate seminar
intended to get everyone working on open research questions.
Syllabus
Relevant references are listed under each topic, but
we may not be able to discuss all of them in class. I will add more
references as discussion in class suggests.
- Survey applications of machine learning to language processing
  - document classification
  - document segmentation, tagging, and entity extraction
  - parsing
  - inducing representations of linguistic objects
  - possible guest lectures (depending on guest availability):
    - dimensionality reduction methods
    - non-negative factorizations
    - finite-state techniques
    - the latest on probabilistic parsing
    - reinforcement learning in dialog systems
- Introduce techniques and open research questions
  - generative vs discriminative:
    - modeling the joint distribution of labels and documents
    - modeling the conditional distribution of labels given documents
    - conditional maximum entropy
    - comparing generative and discriminative probabilistic models
    - minimizing (bounds on) classification error
    - maximum margin (boosting, SVM)
    - online (winnow)
  - limited training data
    - smoothing and model averaging
    - how sensitive are different methods to small samples, noise?
    - using unlabeled data
  - modeling sequences: probabilistic mappings vs local classifiers
    - joint models: hidden Markov models
    - conditional models: maximum-entropy Markov models and random fields
    - partial labeling: expectation-maximization with the forward-backward algorithm
    - local classifiers: boosting, winnow
  - creating representations by unsupervised learning
    - latent variable models
    - information bottleneck
    - smoothing
    - feature induction
- Practice research methods
  - surveying the literature
  - summarizing mathematical models and algorithms
  - identifying workable open questions
  - designing informative experiments
  - reporting results effectively
- Produce reusable results
  - survey of the literature
  - "gentle introduction to X"
  - practical implementation of previously impractical algorithm
  - critical data set for evaluating an important hypothesis
  - publishable paper (conference deadlines coming up early 2002!)
Format (tentative)
- What I do:
  - Give introductory lectures on each topic
  - Select appropriate papers
  - Steer discussion
  - Comment when necessary
  - Suggest and advise on projects
  - Organize a course Web site with links to course products
- What you do:
  - Individually:
    - Present a comparative discussion of a few papers on a given topic
    - Take responsibility for a well-defined part of a project
    - Serve as discussant of a project other than yours
  - Individually or as member of a work group:
    - Present a project proposal (15-20 minutes)
    - Work on the project
    - Present a mid-term project status report (10-15 minutes)
    - Write a project report accessible on the Web to the class
    - Present the project report (15-20 minutes), discussed by pre-appointed discussants
Grading (if you must know)
40% class participation, 60% project quality. A top-notch project
is something that with a bit of fine-tuning could be submitted confidently
to a conference such as ICML, ACL, EMNLP/SIGDAT, NIPS. The minimum
project is something -- a survey, introduction, implementation --
sufficiently well done that interested people can use it and learn
from it for at least the next few years.