CIS 530 -- Introduction to Natural Language Processing 2008



Instructor
Mitch Marcus  
Office: Levine 503 
mitch (AT) cis.upenn.edu 

Office Hours: TBA

Teaching Assistant
Lucas Champollion
Office: Williams 401/402
champoll (AT) ling.upenn.edu

Office Hours: Tuesdays, 11am-12noon
 
Teaching Assistant
Qiuye Zhao
Office: GRW 571
qiuye (AT) seas.upenn.edu

Office Hours: Mondays, 3pm-4pm  

Class Schedule: Tuesday & Thursday, 4:30pm to 6:00pm, Towne 307

Course Administrator: Cheryl Hickey, 502 Levine, 215-898-3538, cherylh (AT) cis.upenn.edu

COURSE STRUCTURE


Web Page:
http://www.seas.upenn.edu/~cis530/

Textbooks:
  • Jurafsky & Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Draft of 2nd edition (selections available at the Ikon Copy Center, Levine Hall).
  • Chris Manning & Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999 (available online from the Penn campus).
  • Various supplementary readings.

    Homework:
    Homework will be distributed in lecture and posted on the web page.
    Late homework will be penalized.


    CLASS MODULES

    Links to classroom slides will appear below.

    Lecture Notes are in Microsoft PowerPoint format. You can view them with either Microsoft PowerPoint or the free Microsoft PowerPoint Viewer.

    Module 1: Introduction & Word-Based Methods  

    Module 2: Parsing
    • Introduction to Syntactic Analysis
    • Context Free Models for English Syntax [ Slides ]
    • Basic CF Parsing Algorithms [ Slides ]
    • Statistical Parsing of CFGs
      • Probabilistic CFGs [ Slides ]
      • Generative Statistical Models [ Slides ]
      • Discriminative Models for Parsing
    • Enriched Models for NL Syntax
      • The inadequacy of CF Models
      • Feature Structures and Unification [ Slides ]
      • Tree Adjoining Grammars [ Slides ]

    Module 3: Meaning
    • Lexical Semantics [ Slides ]
      • Word Sense Disambiguation: Decision Lists, SVMs
    • Logical Form and Semantics [ Slides ]
      • Introduction to Logical Form
      • Mapping from Syntactic Structures to LF
    • Practical Methods: Information Extraction and Named Entity Recognition [ Slides ]
    • Practical Methods: Naive Bayes for Spam Filtering; SVMs, Perceptrons [ Slides ]
    • Discourse & Pragmatics
      • Text Coherence & Discourse Structure

    Module 4: Putting the Pieces Together
    • Machine Translation [ Slides ]
      • Statistical Translation: The state of the art [ Slides ]

      ** We are now here (more or less) - 4/28/2008 **
    • Generation & Summarization
      • Text Planning, Content Determination and Realization
      • Statistical Techniques for Summarization


    HOMEWORK ASSIGNMENTS

    How to submit:
    1. Connect to eniac.seas.upenn.edu.
    2. Type the command 'turnin -c cis530 -p hwx filename'.
    3. If the system asks you to choose a section, type 'ALL'.


    Assignment I


    Assignment II


    Assignment III
    • Due: Tuesday, Mar. 18th, 2008. [ data ]





    OTHER RESOURCES

    Python Resources




    For more information, please contact mitch (AT) cis.upenn.edu


  • Last changed: Monday, April 28, 2008