edu.upenn.cis.spinal
Class ElemTree

java.lang.Object
  extended by edu.upenn.cis.spinal.ElemTree
All Implemented Interfaces:
Serializable

public class ElemTree
extends Object
implements Serializable

Represents a "spinal" elementary tree in Libin Shen's LTAG-spinal treebank. A typical elementary tree (taken from p. 73 of his thesis) looks like this:

#3 failed
 spine: a_( S ( VP VBD^ ) )
 att #0, on 0, slot 0, order 0
 att #2, on 0, slot 0, order 1
 att #6, on 0.0, slot 1, order 0

or like this (a structure for predicate coordination):

&20
 spine: c_( S S S )
 crd #3, on 0.0
 att #9, on 0, slot 1, order 0
 crd #10, on 0.1
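The attachment lines in the examples above follow a regular shape: a keyword, the number of the attached tree, a Gorn address, and (for some lines) a slot and an order. The following is a minimal sketch of how such a line could be split into fields. The class AttachmentLineSketch and its field names are hypothetical and not part of edu.upenn.cis.spinal; the actual parsing done by the ElemTree constructor may differ, and the value -1 for a missing slot or order is an assumption of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class AttachmentLineSketch {
    // keyword, "#" + tree number, "on" + Gorn address, then optional slot and order
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(att|adj|crd|con) #(\\d+), on ([0-9.]+)(?:, slot (\\d+), order (\\d+))?\\s*$");

    final String keyword;
    final int child;
    final String address;
    final int slot;   // -1 when the line carries no slot (assumption of this sketch)
    final int order;  // -1 when the line carries no order (assumption of this sketch)

    private AttachmentLineSketch(String keyword, int child, String address, int slot, int order) {
        this.keyword = keyword;
        this.child = child;
        this.address = address;
        this.slot = slot;
        this.order = order;
    }

    static AttachmentLineSketch parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not an attachment line: " + line);
        }
        int slot = m.group(4) == null ? -1 : Integer.parseInt(m.group(4));
        int order = m.group(5) == null ? -1 : Integer.parseInt(m.group(5));
        return new AttachmentLineSketch(
            m.group(1), Integer.parseInt(m.group(2)), m.group(3), slot, order);
    }

    public static void main(String[] args) {
        AttachmentLineSketch a = parse("att #6, on 0.0, slot 1, order 0");
        System.out.println(a.keyword + " #" + a.child + " on " + a.address);
    }
}
```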

Author:
Lucas Champollion, Ryan Gabbard
See Also:
Serialized Form

Field Summary
static int ADJOIN
          Designates a tree combined with its parent by an adjunction operation, indicated by the keyword "adj".
static int ATTACH
          Designates a tree combined with its parent by an attachment operation, indicated by the keyword "att".
static int AUXILIARY
          Designates an auxiliary tree.
static int CONJUNCT
          Designates a tree combined with its parent by a conjunction operation, indicated by the keyword "crd" in the output of the incremental parser and in the LTAG-spinal treebank.
static int CONJUNCT_OR_CONNECTIVE
          Designates a tree combined with its parent by a conjunction or an attachment operation, indicated by the keyword "con" in the output of the bidirectional parser.
static int COORD
          Designates a coordination tree -- these trees are special constructs and not anchored in a lexical item.
static int INITIAL
          Designates an initial tree.
static int LEFT
          Designates a tree that attaches/adjoins on the left of its parent.
static int RIGHT
          Designates a tree that attaches/adjoins on the right of its parent.
static int ROOT
          A return value designating the special case in which the current tree has no parent.
static int UNKNOWN
          Designates a field (tree, slot, etc.) of unknown type.
 
Constructor Summary
ElemTree(Sentence container, String representation)
          Creates an ElemTree from a string representation and attaches it into the provided derivation tree (Sentence).
 
Method Summary
 boolean attachesFromLeft()
          Returns true if this ElemTree attaches or adjoins from the left of its parent.
 boolean attachesFromRight()
          Returns true if this ElemTree attaches or adjoins from the right of its parent.
 boolean dominates(ElemTree other)
          Returns true iff this ElemTree dominates the other tree in the sentence (this includes the case in which this.equals(other)).
 SpinalNode getAnchor()
          Returns the node representing the lexical anchor of this tree, or null if there is no anchor (in the case of a coordination structure).
 SpinalNode getAttachmentSite()
          Returns the SpinalNode in the parent tree to which this ElemTree is attached, or null if this ElemTree is the root of the derivation tree.
 int getAttachmentType()
          Returns the type of the attachment of this elementary tree to its parent tree, one of ATTACH, ADJOIN, CONJUNCT (only used in the LTAG-spinal treebank and in the output of the incremental parser), and CONJUNCT_OR_CONNECTIVE (only used in the output of the bidirectional parser).
 Iterator getChildren()
          Returns the ElemTrees that attach to this ElemTree.
 Iterator getChildrenSpans()
          Returns the spans of the ElemTrees that attach to this ElemTree.
 List getDominatedElemTrees()
          Returns a List of the elementary trees dominated by this elementary tree, including this tree itself.
 List getDominatedTerminals()
          Returns a List of the terminals attached to the elementary trees dominated by this elementary tree, including this tree itself.
 int getFileNumber()
          Returns the number of the file that contains the sentence to which this ElemTree belongs, or -1 if there is no such number.
 SpinalNode getFoot()
          Returns the node representing the foot node (if any) of this tree, or null if there is no foot node.
 int getNumber()
          Returns the number of this tree, corresponding to the position in the sentence.
 ElemTree getParent()
          Returns the ElemTree to which this ElemTree attaches, or null iff this is the root.
 PASLoc getPASLoc()
          Returns the location of the predicate-argument structure corresponding to this elementary tree in the Propbank, or null if this tree has no section and file numbers.
 String getPOS()
          Returns the part of speech of this elementary tree, or "NA" in the case of a coordination structure (where part of speech is not really applicable since Coord nodes aren't lexicalized).
 int getSectionNumber()
          Returns the number of the section that contains the sentence to which this ElemTree belongs, or -1 if there is no such number.
 Sentence getSentence()
          Returns the Sentence to which this ElemTree belongs.
 int getSentenceNumber()
          Returns the number of the sentence to which this ElemTree belongs, or -1 if there is no such number.
 int getSlot()
          Returns whether this elementary tree attaches to the LEFT or RIGHT side of its parent.
 WordSpan getSpan()
          Returns the part of the sentence that is spanned by the yield of this elementary tree and its descendants.
 SpinalNode getSpinalNodeAt(GornAddress g)
          Returns the spinal node at a given Gorn address in this tree.
 SpinalNode getSpine()
          Returns the spine of this tree.
 String getSurfaceString()
          Returns the yield of the subtree rooted in this elementary tree, with terminals separated by a single white space.
 String getTerminal()
          Returns the actual terminal string (word in most cases) represented by this ElemTree.
 int getType()
          Returns the type of this elementary tree, one of UNKNOWN, INITIAL, AUXILIARY, and COORD.
 String getTypeAsString()
          Returns a string representing the type of this elementary tree, one of initial, auxiliary, coordination, and unknown.
 boolean hasChildren()
          Returns true iff this ElemTree has other ElemTrees attached to it.
 boolean isAuxiliary()
          Returns true iff this tree is of type "auxiliary".
 boolean isBidirectionalParserOutput()
          Returns true if this ElemTree has been read in from the format used in the output of Shen's bidirectional parser.
 boolean isCoord()
          Returns true iff this tree is of type "coordination".
 boolean isEmptyElement()
          Returns true if the anchor (if there is one) is an empty element, as determined by checking whether the terminal contains an unescaped asterisk.
 boolean isInitial()
          Returns true iff this tree is of type "initial".
 boolean isOfUnknownType()
          Returns true iff this tree is of unknown type.
 boolean isParentOf(ElemTree other)
          Returns true iff this ElemTree is the direct parent of the other tree in the derivation tree.
 boolean isRoot()
          Returns true iff this elementary tree has no parent.
 String toString()
          Converts this tree to a canonical string representation of the kind used in Libin Shen's thesis.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

UNKNOWN

public static final int UNKNOWN
Designates a field (tree, slot, etc.) of unknown type.

See Also:
Constant Field Values

INITIAL

public static final int INITIAL
Designates an initial tree.

See Also:
Constant Field Values

AUXILIARY

public static final int AUXILIARY
Designates an auxiliary tree.

See Also:
Constant Field Values

COORD

public static final int COORD
Designates a coordination tree -- these trees are special constructs and not anchored in a lexical item.

See Also:
Constant Field Values

ROOT

public static final int ROOT
A return value designating the special case in which the current tree has no parent.

See Also:
Constant Field Values

ATTACH

public static final int ATTACH
Designates a tree combined with its parent by an attachment operation, indicated by the keyword "att".

See Also:
Constant Field Values

ADJOIN

public static final int ADJOIN
Designates a tree combined with its parent by an adjunction operation, indicated by the keyword "adj".

See Also:
Constant Field Values

CONJUNCT

public static final int CONJUNCT
Designates a tree combined with its parent by a conjunction operation, indicated by the keyword "crd" in the output of the incremental parser and in the LTAG-spinal treebank.

See Also:
Constant Field Values

CONJUNCT_OR_CONNECTIVE

public static final int CONJUNCT_OR_CONNECTIVE
Designates a tree combined with its parent by a conjunction or an attachment operation, indicated by the keyword "con" in the output of the bidirectional parser. In the treebank and in the output of the incremental parser, "crd" is used for conjuncts and "att" is used for connectives. The output of the bidirectional parser does not distinguish conjuncts from connectives, which saves one parsing operation, so "con" represents both.

See Also:
Constant Field Values
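
The keywords named in the field descriptions above ("att", "adj", "crd", "con") can be read as a lookup from keyword to attachment type. The sketch below illustrates that lookup under stated assumptions: the enum Kind stands in for the int constants of ElemTree, whose numeric values are not given on this page, and mapping an unrecognized keyword to UNKNOWN is a choice made for this sketch (the real constructor may instead throw ElemTreeFormatException).

```java
/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class AttachmentKeywordSketch {
    /** Stands in for ElemTree's int constants; their actual values are not shown here. */
    enum Kind { ATTACH, ADJOIN, CONJUNCT, CONJUNCT_OR_CONNECTIVE, UNKNOWN }

    static Kind fromKeyword(String keyword) {
        switch (keyword) {
            case "att": return Kind.ATTACH;                 // attachment
            case "adj": return Kind.ADJOIN;                 // adjunction
            case "crd": return Kind.CONJUNCT;               // treebank, incremental parser
            case "con": return Kind.CONJUNCT_OR_CONNECTIVE; // bidirectional parser
            default:    return Kind.UNKNOWN;                // assumption of this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(fromKeyword("crd")); // CONJUNCT
        System.out.println(fromKeyword("con")); // CONJUNCT_OR_CONNECTIVE
    }
}
```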

LEFT

public static final int LEFT
Designates a tree that attaches/adjoins on the left of its parent.

See Also:
Constant Field Values

RIGHT

public static final int RIGHT
Designates a tree that attaches/adjoins on the right of its parent.

See Also:
Constant Field Values
Constructor Detail

ElemTree

public ElemTree(Sentence container,
                String representation)
         throws ElemTreeFormatException
Creates an ElemTree from a string representation and attaches it into the provided derivation tree (Sentence). We assume that the nodes are numbered in ascending order of linear precedence of their anchors, as is the case in the LTAG-spinal treebank.

Parameters:
container - the Sentence to which this ElemTree is to be added
representation - the string that is to be parsed into an ElemTree
Throws:
ElemTreeFormatException - if the string representation is not well-formed
Method Detail

attachesFromLeft

public boolean attachesFromLeft()
Returns true if this ElemTree attaches or adjoins from the left of its parent.

Returns:
a boolean value
See Also:
getSlot()

attachesFromRight

public boolean attachesFromRight()
Returns true if this ElemTree attaches or adjoins from the right of its parent.

Returns:
a boolean value
See Also:
getSlot()

getAnchor

public SpinalNode getAnchor()
Returns the node representing the lexical anchor of this tree, or null if there is no anchor (in the case of a coordination structure).

Returns:
a SpinalNode element representing the lexical anchor

getAttachmentSite

public SpinalNode getAttachmentSite()
Returns the SpinalNode in the parent tree to which this ElemTree is attached, or null if this ElemTree is the root of the derivation tree.

Returns:
the parent's attachment site

getAttachmentType

public int getAttachmentType()
Returns the type of the attachment of this elementary tree to its parent tree, one of ATTACH, ADJOIN, CONJUNCT (only used in the LTAG-spinal treebank and in the output of the incremental parser), and CONJUNCT_OR_CONNECTIVE (only used in the output of the bidirectional parser). If this elementary tree is the root of the sentence, the special value ROOT is returned.

Returns:
the type of attachment of this ElemTree

getChildren

public Iterator getChildren()
Returns the ElemTrees that attach to this ElemTree.

Returns:
the children of this node in the derivation tree

getChildrenSpans

public Iterator getChildrenSpans()
Returns the spans of the ElemTrees that attach to this ElemTree.

Returns:
an iterator over WordSpan objects

getDominatedElemTrees

public List getDominatedElemTrees()
Returns a List of the elementary trees dominated by this elementary tree, including this tree itself.

Returns:
a list of ElemTree objects

getDominatedTerminals

public List getDominatedTerminals()
Returns a List of the terminals attached to the elementary trees dominated by this elementary tree, including this tree itself.

Returns:
a list of terminals (String objects)

getFileNumber

public int getFileNumber()
Returns the number of the file that contains the sentence to which this ElemTree belongs, or -1 if there is no such number.

Returns:
an int

getFoot

public SpinalNode getFoot()
Returns the node representing the foot node (if any) of this tree, or null if there is no foot node. (Only auxiliary trees have foot nodes.)

Returns:
a SpinalNode

getNumber

public int getNumber()
Returns the number of this tree, corresponding to the position in the sentence.

Returns:
the number, or -1 if unknown

getPASLoc

public PASLoc getPASLoc()
Returns the location of the predicate-argument structure corresponding to this elementary tree in the Propbank, or null if this tree has no section and file numbers. The location has the following shape: wsj/<section>/wsj_<section><file>.mrg.

Returns:
a PASLoc that indicates the location of this word in the Propbank
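
The path shape wsj/<section>/wsj_<section><file>.mrg can be illustrated with a small formatter. This is only a sketch: the class PropbankPathSketch is hypothetical, it returns a plain String rather than a PASLoc, and the two-digit zero padding of section and file numbers is an assumption based on the usual Penn Treebank WSJ file naming, not something stated on this page.

```java
import java.util.Locale;

/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class PropbankPathSketch {
    /**
     * Builds wsj/<section>/wsj_<section><file>.mrg, or returns null when either
     * number is missing (represented as -1, as in getSectionNumber/getFileNumber).
     */
    static String path(int section, int file) {
        if (section < 0 || file < 0) {
            return null;
        }
        // Two-digit zero padding is an assumption of this sketch.
        return String.format(Locale.ROOT, "wsj/%1$02d/wsj_%1$02d%2$02d.mrg", section, file);
    }

    public static void main(String[] args) {
        System.out.println(path(2, 5));  // wsj/02/wsj_0205.mrg
        System.out.println(path(-1, 5)); // null
    }
}
```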

getPOS

public String getPOS()
Returns the part of speech of this elementary tree, or "NA" in the case of a coordination structure (where part of speech is not really applicable since Coord nodes aren't lexicalized).

Returns:
the POS tag of the current word

getParent

public ElemTree getParent()
Returns the ElemTree to which this ElemTree attaches, or null iff this is the root.

Returns:
the parent of this node in the derivation tree

getSectionNumber

public int getSectionNumber()
Returns the number of the section that contains the sentence to which this ElemTree belongs, or -1 if there is no such number.

Returns:
an int

getSentence

public Sentence getSentence()
Returns the Sentence to which this ElemTree belongs.

Returns:
a Sentence object

getSentenceNumber

public int getSentenceNumber()
Returns the number of the sentence to which this ElemTree belongs, or -1 if there is no such number.

Returns:
an int

getSlot

public int getSlot()
Returns whether this elementary tree attaches to the LEFT or RIGHT side of its parent. Returns ROOT if this tree is the root of the sentence and UNKNOWN if the value is not known.

Returns:
one of LEFT, RIGHT, ROOT, or UNKNOWN
See Also:
attachesFromLeft(), attachesFromRight()

getSpan

public WordSpan getSpan()
Returns the part of the sentence that is spanned by the yield of this elementary tree and its descendants. If the span is discontinuous, the left- and rightmost ElemTree locations are given.

Returns:
a WordSpan object

getSpinalNodeAt

public SpinalNode getSpinalNodeAt(GornAddress g)
Returns the spinal node at a given Gorn address in this tree.

Parameters:
g - a GornAddress
Returns:
a SpinalNode
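
As an illustration of address lookup, the sketch below resolves a dotted address against a small hand-built tree. It rests on assumptions: the Node class is a hypothetical stand-in for SpinalNode, the leading 0 is taken to denote the root of the spine (which matches the "on 0" and "on 0.0" lines in the class description), and the spine a_( S ( VP VBD^ ) ) is read as S dominating VP dominating VBD^. Returning null for an address that does not exist is also a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class GornSketch {
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** Resolves an address such as "0.0"; the leading 0 is taken to denote the root. */
    static Node nodeAt(Node root, String address) {
        String[] parts = address.split("\\.");
        if (!parts[0].equals("0")) {
            return null;
        }
        Node current = root;
        for (int i = 1; i < parts.length; i++) {
            int index = Integer.parseInt(parts[i]);
            if (index < 0 || index >= current.children.size()) {
                return null; // no such node
            }
            current = current.children.get(index);
        }
        return current;
    }

    public static void main(String[] args) {
        Node spine = new Node("S", new Node("VP", new Node("VBD^")));
        System.out.println(nodeAt(spine, "0").label);   // S
        System.out.println(nodeAt(spine, "0.0").label); // VP
    }
}
```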

getSpine

public SpinalNode getSpine()
Returns the spine of this tree.

Returns:
a SpinalNode representing the root of the spine

getSurfaceString

public String getSurfaceString()
Returns the yield of the subtree rooted in this elementary tree, with terminals separated by a single white space. Empty elements are included.

Returns:
a string corresponding to the current ElemTree and all its descendants
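
One way to picture the yield is to collect a tree and everything below it in the derivation tree, order the trees by their number (their position in the sentence), and join the terminals with single spaces. The sketch below does this with a hypothetical Tree class standing in for ElemTree; how the library itself handles trees without a terminal, such as coordination structures, is not modelled here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class YieldSketch {
    static final class Tree {
        final int number;        // position in the sentence
        final String terminal;   // the word anchoring this tree
        final List<Tree> children = new ArrayList<>();

        Tree(int number, String terminal) {
            this.number = number;
            this.terminal = terminal;
        }

        Tree add(Tree child) {
            children.add(child);
            return this;
        }
    }

    /** Adds the tree itself and all trees below it to out. */
    static void collect(Tree tree, List<Tree> out) {
        out.add(tree);
        for (Tree child : tree.children) {
            collect(child, out);
        }
    }

    /** Joins the terminals of all dominated trees in sentence order. */
    static String surfaceString(Tree tree) {
        List<Tree> all = new ArrayList<>();
        collect(tree, all);
        all.sort(Comparator.comparingInt((Tree t) -> t.number));
        StringBuilder sb = new StringBuilder();
        for (Tree t : all) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.terminal);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Tree failed = new Tree(2, "failed").add(new Tree(1, "deal").add(new Tree(0, "the")));
        System.out.println(surfaceString(failed)); // the deal failed
    }
}
```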

getTerminal

public String getTerminal()
Returns the actual terminal string (word in most cases) represented by this ElemTree.

Returns:
the terminal string at the fringe of this elementary tree

getType

public int getType()
Returns the type of this elementary tree, one of UNKNOWN, INITIAL, AUXILIARY, and COORD.

Returns:
the type of this ElemTree

getTypeAsString

public String getTypeAsString()
Returns a string representing the type of this elementary tree, one of initial, auxiliary, coordination, and unknown.

Returns:
the type of this ElemTree in string representation

isCoord

public boolean isCoord()
Returns true iff this tree is of type "coordination".

Returns:
a boolean value

isAuxiliary

public boolean isAuxiliary()
Returns true iff this tree is of type "auxiliary".

Returns:
a boolean value

isInitial

public boolean isInitial()
Returns true iff this tree is of type "initial".

Returns:
a boolean value

isOfUnknownType

public boolean isOfUnknownType()
Returns true iff this tree is of unknown type.

Returns:
a boolean value

isEmptyElement

public boolean isEmptyElement()
Returns true if the anchor (if there is one) is an empty element, as determined by checking whether the terminal contains an unescaped asterisk. Returns false if this is a coordination node, in which case there is no anchor anyway.

Returns:
a boolean value
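
The unescaped-asterisk test can be sketched with a regular expression. This is a guess at the idea rather than the library's code: the class EmptyElementSketch is hypothetical, and it assumes that the escape character is a backslash, which this page does not state. A terminal of null stands in for a coordination node without an anchor.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class EmptyElementSketch {
    // An asterisk that is not immediately preceded by a backslash.
    // Treating the backslash as the escape character is an assumption.
    private static final Pattern UNESCAPED_ASTERISK = Pattern.compile("(?<!\\\\)\\*");

    static boolean isEmptyElement(String terminal) {
        return terminal != null && UNESCAPED_ASTERISK.matcher(terminal).find();
    }

    public static void main(String[] args) {
        System.out.println(isEmptyElement("*T*-1"));  // true
        System.out.println(isEmptyElement("\\*"));    // false
        System.out.println(isEmptyElement("failed")); // false
    }
}
```

The lookbehind only inspects the single character before each asterisk, so a terminal containing an escaped backslash directly followed by an asterisk would need a more careful scan.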

isRoot

public boolean isRoot()
Returns true iff this elementary tree has no parent.

Returns:
a boolean value

hasChildren

public boolean hasChildren()
Returns true iff this ElemTree has other ElemTrees attached to it.

Returns:
a boolean value

toString

public String toString()
Converts this tree to a canonical string representation of the kind used in Libin Shen's thesis.

Overrides:
toString in class Object
Returns:
a string representation

dominates

public boolean dominates(ElemTree other)
Returns true iff this ElemTree dominates the other tree in the sentence (this includes the case in which this.equals(other)).

Parameters:
other - the other tree
Returns:
a boolean value

isParentOf

public boolean isParentOf(ElemTree other)
Returns true iff this ElemTree is the direct parent of the other tree in the derivation tree.

Parameters:
other - the other tree
Returns:
a boolean value
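
The two relations above differ in that dominance is reflexive and transitive, while parenthood covers exactly one step in the derivation tree. A minimal sketch, assuming each tree keeps a reference to its parent: the Tree class is a hypothetical stand-in for ElemTree, and the sketch compares by identity where the documentation speaks of this.equals(other).

```java
/** Hypothetical stand-in for illustration; not part of edu.upenn.cis.spinal. */
final class DominanceSketch {
    static final class Tree {
        final Tree parent; // null for the root of the derivation tree

        Tree(Tree parent) {
            this.parent = parent;
        }
    }

    /** True if self is other or an ancestor of other. */
    static boolean dominates(Tree self, Tree other) {
        for (Tree t = other; t != null; t = t.parent) {
            if (t == self) {
                return true;
            }
        }
        return false;
    }

    /** True if self is exactly one step above other. */
    static boolean isParentOf(Tree self, Tree other) {
        return self != null && other != null && other.parent == self;
    }

    public static void main(String[] args) {
        Tree root = new Tree(null);
        Tree child = new Tree(root);
        Tree grandchild = new Tree(child);
        System.out.println(dominates(root, grandchild));  // true
        System.out.println(isParentOf(root, grandchild)); // false
    }
}
```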

isBidirectionalParserOutput

public boolean isBidirectionalParserOutput()
Returns true if this ElemTree has been read in from the format used in the output of Shen's bidirectional parser. If this is the case, no information about the spine is present.

Returns:
a boolean value