CIS 639

Statistical Approaches to Natural Language Processing

Spring 2002

 

(Constantly Under Construction)

 

Mitch Marcus

(mitch@linc.cis.upenn.edu)

Office: Moore 461a

Phone: 215-898-2538

 

 

See here for course syllabus and overview. 

 

Readings available on the web & additional readings

HMMs:

For more info: Jelinek, F. Statistical Methods for Speech Recognition. MIT Press: Cambridge (1998), Chapter 2.

Eric Brill, A Simple Rule-Based Part Of Speech Tagger, Proceedings of ANLP-92, 3rd Conference on Applied Natural Language Processing

 

Eric Brill, Some Advances in Transformation-Based Part of Speech Tagging, AAAI, Vol. 1, 1994

 

 

The class will interleave three modes:

 

  1. Lectures on the contents of Section III of Manning & Schütze, Foundations of Statistical Natural Language Processing.
  2. Student-led discussions of recent papers on NP Chunking from the group of papers to be found

 

  3. Group discussion of the details of maximum entropy and generative probabilistic models for statistical NLP presented in Michael Collins's Ph.D. dissertation and Adwait Ratnaparkhi's Ph.D. dissertation.

 

Required work will include leading a discussion of selected papers, a final paper or course project, and two or three exercises during the semester.