Assignment 11: Incremental finite-state parsing
For this assignment you need to run the xfsp application,
Salah Aït-Mokhtar's tool for developing incremental finite-state
parsers. Please read the documentation at
http://www.cis.upenn.edu/~cis639/docs/xfsp/
Start by creating a directory that contains the following five files:
You first need to compile the xfst parser scripts, NP.xfst and
PP.xfst, to create the corresponding network files NP.fst and PP.fst.
The scripts themselves write these files. Note that each transducer
is inverted before it is saved, because xfsp applies transducers
in the upward direction.
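To see why the inversion matters, here is a minimal sketch in Python, not xfst. It models a transducer as a set of (upper, lower) string pairs. The one-pair relation and the function names are made up for illustration, and the sketch assumes the parser rules are written with the unmarked text on the upper side and the bracketed text on the lower side.

```python
# Toy model: a transducer is a set of (upper, lower) string pairs.
# Everything here is illustrative; it is not how xfst stores networks.

def invert(relation):
    """Swap the upper and lower side of every pair."""
    return {(lower, upper) for upper, lower in relation}

def apply_down(relation, s):
    """Look s up on the upper side and return the matching lower strings."""
    return {lower for upper, lower in relation if upper == s}

def apply_up(relation, s):
    """Look s up on the lower side and return the matching upper strings."""
    return {upper for upper, lower in relation if lower == s}

# Hypothetical rule as written: plain text above, bracketed text below.
np_rule = {("the/Det-Def fog/Nn-Sg", "NP[ the/Det-Def fog/Nn-Sg]NP")}

text = "the/Det-Def fog/Nn-Sg"
as_written = apply_down(np_rule, text)          # adds the brackets
after_inversion = apply_up(invert(np_rule), text)  # same result, upward
```

Under these assumptions, applying the rule downward as written and applying the inverted rule upward give the same bracketed string, which is why a tool that only applies upward needs the inverted network.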
Next, launch xfsp with the command
/pkg/cis639/bin/xfsp english.xfsp
and try it on the following text:
Although the drivers should have been checking their speedometers to
verify their speed, most of them were too afraid to take their eyes off
the road out of fear that something might emerge from the fog.
You should see the following result:
###########################
PP.fst applied:
###########################
[S Although/Conj-Sub NP[ the/Det-Def drivers/Nn-Pl]NP should/Aux
have/V-Pres been/V-PaPart-be checking/V-Prog NP[ their/Det-Poss
speedometers/Nn-Pl]NP to/Part-Inf verify/V-Pres NP[ their/Det-Poss
speed/Nn-Sg]NP ,/Punct-Comma most/Adj-Sup of/Prep-of them/Pron
were/V-Past-Pl-be too/Adv afraid/Adj to/Part-Inf take/V-Pres NP[
their/Det-Poss eyes/Nn-Pl]NP PP[ off/Prep NP[ the/Det-Def
road/Nn-Sg]NP]PP out/Prep of/Prep-of fear/Nn-Sg that/Conj-Sub
something/Pron might/Aux emerge/V-Pres
PP[ from/Prep NP[ the/Det-Def fog/Nn-Sg]NP]PP
./Punct-Sent ]S
Although xfsp always applies all the transducers that
are marked for application, it reports only the name of the
last transducer in the chain. In this case, both the
NP and PP transducers were applied.
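The chaining behavior can be pictured with a rough Python sketch. This is not xfsp's implementation: the two regular-expression stages below are crude, hypothetical stand-ins for NP.fst and PP.fst, and run_cascade is an invented helper. The point is only that every stage runs in order while the report names the last one.

```python
import re

# Hypothetical stand-ins for the two networks; real rules are richer.
def np_stage(text):
    # Bracket a determiner followed directly by a noun.
    return re.sub(r"(\S+/Det\S*) (\S+/Nn\S*)", r"NP[ \1 \2]NP", text)

def pp_stage(text):
    # Bracket a preposition followed directly by an NP.
    return re.sub(r"(\S+/Prep) (NP\[ .*?\]NP)", r"PP[ \1 \2]PP", text)

def run_cascade(stages, text):
    """Apply every (name, function) stage in order.

    Returns the name of the last stage together with the final text,
    mirroring the way only the last transducer is named in the report.
    """
    last_name = None
    for name, stage in stages:
        text = stage(text)
        last_name = name
    return last_name, text

stages = [("NP.fst", np_stage), ("PP.fst", pp_stage)]
reported, parsed = run_cascade(stages, "from/Prep the/Det-Def fog/Nn-Sg")
```

Here the PP stage can only succeed because the NP stage ran first, even though the report mentions PP.fst alone.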
The Task
Start by improving the NP and PP parsers. Note that some
noun phrases and prepositional phrases in the above example are
not recognized: for instance, "fear" is left outside any NP and
"out of fear" is not bracketed as a PP. Then add a recognizer for VPs.
Finally, add a fourth transducer that eliminates all the
part-of-speech tags, leaving just the original words and
the structure you have introduced with your parse rules.
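As a way to check what the fourth transducer should produce, the following Python sketch strips the tags from a line of parser output. It is only a reference for the expected result, not a substitute for the xfst rule. The function name strip_tags is invented, and the sketch assumes that a tag is a slash followed by letters and hyphens and that it ends at whitespace, a closing bracket, or the end of the line.

```python
import re

# Assumed tag shape: "/" + letters and hyphens, ending before
# whitespace, "]" or end of line (e.g. /Det-Def, /Nn-Sg, /Punct-Sent).
TAG = re.compile(r"/[A-Za-z][A-Za-z-]*(?=\s|\]|$)")

def strip_tags(parsed):
    """Remove part-of-speech tags, keeping words and bracket labels."""
    return TAG.sub("", parsed)

sample = "PP[ off/Prep NP[ the/Det-Def road/Nn-Sg]NP]PP ./Punct-Sent ]S"
stripped = strip_tags(sample)
```

Because the pattern requires the tag to end at a boundary, a slash inside a word (as in "and/or/Conj") is left alone and only the final tag is removed.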
Copyright © 1998 Xerox Corporation.