|
This talk will first
introduce noun phrase coreference resolution, one of the critical
problems that currently limit performance on many practical
natural language processing tasks. Briefly, the goal of noun
phrase coreference algorithms is to determine which noun phrases
in a text or dialogue refer to the same real-world entity.
We then examine coreference
resolution in the context of two machine learning paradigms:
supervised learning and weakly supervised learning. Supervised
approaches to coreference resolution have, in general, been relatively
successful, operating primarily by training a classifier that
determines whether or not two noun phrases are coreferent. Supervised
methods, however, require large amounts of annotated training
data that is expensive and difficult to obtain. Weakly supervised
approaches, on the other hand, aim to obtain classifiers using
much less human-labeled data, by bootstrapping a large set
of automatically labeled data from a very small set of labeled
instances.
We will describe the
specific supervised and weakly supervised algorithms that we have
applied to the problem of noun phrase coreference resolution,
present our empirical results on two standard coreference data
sets, and discuss the problems encountered in applying each framework
to the coreference task.
This is joint work with
graduate student Vincent Ng.
|