Noun Phrase Recognition (Grouping)

This document describes the noun-phrase recognition services of the LinguistX Platform library, version 2.2. For the language-specific behavior of each language module, see the documentation shipped with that module.

Contents

  1. Introduction
  2. Specific Languages

Introduction

The phrase extractor works on one or more complete sentences at a time and finds simple noun phrases in the input. (For the purposes of this application, "simple noun phrases" do not include determiners; they are closer to what a linguist might call "N-bar".) Noun phrase recognition requires that tokenization and part-of-speech tagging be run first: the input to the phrase extractor is the list of part-of-speech tags output by the tagger.

By default, the phrase extractor finds noun phrases of maximum extent in the input. For example, given the sentence:

The President considered the impact of foreign trade policy on American businesses.

The maximum-extent noun phrases found are:

President
impact of foreign trade policy
American businesses
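
The maximum-extent behavior can be illustrated with a small Python sketch. This is not the LinguistX API or its grammar: the tag names (DET, NOUN, ADJ, VERB, OF, PREP, PUNCT), the function name, and the phrase pattern (optional adjectives, one or more nouns, optionally continued by "of" plus another such group) are all assumptions chosen only to reproduce the example above.

```python
import re

# Hypothetical coarse tagset, mapped to one character per tag so that a
# tag sequence can be searched as a string. Any other tag becomes "x".
TAG_CODES = {"NOUN": "N", "ADJ": "A", "OF": "O"}

# Assumed phrase pattern: adjectives, then nouns, optionally extended by
# "of" followed by another adjectives-plus-nouns group. Determiners are
# deliberately excluded.
NP_PATTERN = re.compile(r"A*N+(?:OA*N+)*")


def maximal_noun_phrases(tagged):
    """Return maximum-extent phrases from a list of (token, tag) pairs."""
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged)
    # The greedy pattern makes each match extend as far right as it can.
    return [
        " ".join(token for token, _ in tagged[m.start():m.end()])
        for m in NP_PATTERN.finditer(codes)
    ]


SENTENCE = [
    ("The", "DET"), ("President", "NOUN"), ("considered", "VERB"),
    ("the", "DET"), ("impact", "NOUN"), ("of", "OF"), ("foreign", "ADJ"),
    ("trade", "NOUN"), ("policy", "NOUN"), ("on", "PREP"),
    ("American", "ADJ"), ("businesses", "NOUN"), (".", "PUNCT"),
]

print(maximal_noun_phrases(SENTENCE))
# ['President', 'impact of foreign trade policy', 'American businesses']
```

Treating "of" as its own tag, separate from other prepositions, is what lets the sketch stop the second phrase before "on"; a real language module may draw that line differently.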

The phrase extractor can also be directed to return all sub-phrases. For the same sentence, finding all sub-phrases results in:

President
impact of foreign trade policy
American businesses
impact
impact of foreign trade
foreign trade
foreign trade policy
trade
trade policy
policy
businesses
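
One way to picture the sub-phrase listing is as every contiguous stretch of a maximum-extent phrase that is itself a well-formed phrase. The Python sketch below works under that assumption; it is not the LinguistX implementation, and the tag names, the pattern, and the function name are hypothetical.

```python
import re

# Hypothetical coarse tagset, one character per tag; other tags become "x".
TAG_CODES = {"NOUN": "N", "ADJ": "A", "OF": "O"}

# Assumed phrase pattern: adjectives, then nouns, optionally extended by
# "of" followed by another adjectives-plus-nouns group.
NP_PATTERN = re.compile(r"A*N+(?:OA*N+)*")


def sub_phrases(tagged_phrase):
    """List every contiguous span of a phrase that matches the pattern.

    The full phrase itself is included in the result.
    """
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged_phrase)
    found = []
    for start in range(len(codes)):
        for end in range(start + 1, len(codes) + 1):
            # fullmatch requires the whole span to be a phrase, so spans
            # such as "impact of" or "foreign" alone are rejected.
            if NP_PATTERN.fullmatch(codes[start:end]):
                tokens = [tok for tok, _ in tagged_phrase[start:end]]
                found.append(" ".join(tokens))
    return found


PHRASE = [
    ("impact", "NOUN"), ("of", "OF"), ("foreign", "ADJ"),
    ("trade", "NOUN"), ("policy", "NOUN"),
]

for phrase in sub_phrases(PHRASE):
    print(phrase)
```

For "impact of foreign trade policy" this prints the phrase itself plus the seven shorter phrases shown in the list above; a span that ends in an adjective or in "of" is never produced.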

