Spectral methods for capturing word meaning

Words can be characterized by the contexts they appear in: "ruby" in "a ruby necklace" does not mean the same thing as "ruby" in "a ruby script". The first phrase is more like "a pearl necklace" and the second more like "a perl script". We are developing a set of methods based on singular value decomposition (SVD) that efficiently compute "eigenfeature" vectors, which characterize the meanings of words based on the contexts they appear in. Our eigenfeatures are useful in a wide variety of NLP tasks, including part-of-speech labeling, word sense disambiguation, named entity recognition, and parsing. The spectral methods have strong theoretical properties, which yield estimation algorithms with performance guarantees for many classic problems, such as estimating Hidden Markov Models (HMMs).
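To make the general idea concrete, here is a minimal sketch of deriving low-dimensional word vectors from a truncated SVD of a word-by-context co-occurrence matrix. It is an illustration only, not our actual method or software: the function names (eigenfeatures, cosine), the four-phrase toy corpus, the one-word context window, and the choice of k are all assumptions made for this example. The sketch also produces a single vector per word type, so it does not by itself separate the two uses of "ruby".

```python
import numpy as np


def eigenfeatures(sentences, k=3, window=1):
    """Return (vocab, vectors) with one k-dimensional context vector per word.

    Hypothetical helper for illustration; not the project's implementation.
    """
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}

    # counts[i, j] = how often word j occurs within `window` positions of word i.
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for pos, w in enumerate(s):
            lo, hi = max(0, pos - window), min(len(s), pos + window + 1)
            for ctx_pos in range(lo, hi):
                if ctx_pos != pos:
                    counts[index[w], index[s[ctx_pos]]] += 1

    # Truncated SVD: keep the top-k left singular vectors, scaled by
    # their singular values, as the word representations.
    u, s_vals, _ = np.linalg.svd(counts, full_matrices=False)
    return vocab, u[:, :k] * s_vals[:k]


def cosine(a, b):
    """Cosine similarity, with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Toy corpus built from the phrases in the text above (an assumption).
corpus = [
    "a pearl necklace".split(),
    "a ruby necklace".split(),
    "a perl script".split(),
    "a ruby script".split(),
]

vocab, vecs = eigenfeatures(corpus, k=3)
row = {w: i for i, w in enumerate(vocab)}
for other in ("pearl", "perl"):
    print("ruby vs", other, cosine(vecs[row["ruby"]], vecs[row[other]]))
```

On a real corpus the count matrix would be large and sparse, so a sparse or randomized SVD routine would likely replace the dense np.linalg.svd call used here.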

Collaborators

CIS: Paramveer Dhillon, Indrepreet Nanda
Statistics

Eigenfeature Dictionaries

coming soon

Papers

coming soon

Software

coming soon

Lyle H. Ungar

ungar@cis.upenn.edu

Dean P. Foster

foster@wharton.upenn.edu