
 

CIS Home divider Penn Engineering divider PENN   spacer
 

 
 Claire Cardie: Machine Learning for Noun Phrase Coreference  

This talk will first introduce noun phrase coreference resolution, one of the critical problems that currently limit performance on many practical natural language processing tasks. Briefly, the goal of a noun phrase coreference algorithm is to determine which noun phrases in a text or dialogue refer to the same real-world entity.
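To make the task concrete, the output of a coreference resolver can be viewed as a partition of the noun phrases in a text, with one group per real-world entity. The following is a minimal sketch under stated assumptions: the example mentions, the hand-picked links, and the helper name `cluster_mentions` are illustrative and do not come from the talk.

```python
def cluster_mentions(mentions, coreferent_pairs):
    """Group mentions into entities via the transitive closure of pairwise links."""
    # Each mention starts in its own group (hypothetical union-find helper).
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the group representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Merge the groups of every pair judged to be coreferent.
    for a, b in coreferent_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())


# Illustrative input: mention strings are assumed unique within the text.
mentions = ["Claire", "she", "the talk", "it", "Vincent"]
links = [("Claire", "she"), ("the talk", "it")]
entities = cluster_mentions(mentions, links)
```

Here `entities` holds three groups: two linked pairs and one singleton. Taking the transitive closure is one simple design choice for turning pairwise decisions into entities; other clustering strategies are possible.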

We then examine coreference resolution in the context of two machine learning paradigms: supervised learning and weakly supervised learning. Supervised approaches to coreference resolution have, in general, been relatively successful, operating primarily by training a classifier that determines whether or not two noun phrases are coreferent. Supervised methods, however, require large amounts of annotated training data, which is expensive and difficult to obtain. Weakly supervised approaches, on the other hand, aim to obtain classifiers using much less human-labeled data by bootstrapping a large set of automatically labeled data from a very small set of labeled instances.
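The bootstrapping idea can be sketched as a simple self-training loop over noun phrase pairs. This is not the algorithm presented in the talk: the two features (a head-match score and a normalized sentence distance), the nearest-centroid classifier, and all function names and parameter values below are assumptions chosen to keep the example self-contained.

```python
def centroid(rows):
    """Mean feature vector of a non-empty list of rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(labeled):
    """Nearest-centroid pair classifier: one centroid per class (True = coreferent)."""
    return {
        label: centroid([x for x, y in labeled if y == label])
        for label in (True, False)
    }


def predict(model, x):
    """Return (label, confidence); confidence is the gap between centroid distances."""
    d_pos, d_neg = sq_dist(x, model[True]), sq_dist(x, model[False])
    return d_pos < d_neg, abs(d_pos - d_neg)


def self_train(seed, unlabeled, rounds=3, per_round=2):
    """Grow the labeled set by repeatedly adding the most confident predictions."""
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        # Rank unlabeled pairs by classifier confidence, most confident first.
        scored = sorted(pool, key=lambda x: predict(model, x)[1], reverse=True)
        for x in scored[:per_round]:
            labeled.append((x, predict(model, x)[0]))
        pool = scored[per_round:]
    return train(labeled), labeled


# Assumed features per pair: [head_match, normalized_sentence_distance].
seed = [([1.0, 0.0], True), ([0.0, 1.0], False)]
unlabeled = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.0, 0.9], [0.8, 0.3], [0.2, 0.7]]
model, labeled = self_train(seed, unlabeled)
```

Starting from two hand-labeled pairs, the loop labels the remaining six automatically. A known risk of this kind of bootstrapping is that early mistakes are fed back into training, which is why only the most confident predictions are added in each round.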

We will describe the specific supervised and weakly supervised algorithms that we have applied to the problem of noun phrase coreference resolution, present our empirical results on two standard coreference data sets, and discuss the problems encountered in applying each framework to the coreference task.

This is joint work with graduate student Vincent Ng.

