This tool automatically identifies explicit discourse connectives and labels each with its sense (Expansion, Contingency, Comparison, or Temporal).
Cue phrases can be ambiguous between discourse and non-discourse usage:
John likes to run marathons, and ran 10 last year alone. (Expansion)
My favorite colors are blue and green. (Non-discourse)
Connectives can also be ambiguous between multiple senses:
They have not spoken to each other since they saw each other last fall. (Temporal)
I assumed you were not coming since you never replied to the invitation. (Contingency/Causal)
This tool is based on the work described in:
Emily Pitler and Ani Nenkova. Using Syntax to Disambiguate Explicit Discourse Connectives in Text. Proceedings of ACL, short paper, 2009.
It is designed to work with either the output of automatic parsers or gold-standard parses, in pretty-printed or one-sentence-per-line format. It outputs the syntactic trees augmented with tags indicating discourse connectives. addDiscourse was trained on sections 2-21 of version 2 of the Penn Discourse Treebank and the corresponding Penn Treebank parses.
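To make the two input layouts concrete, here is a minimal Python sketch that rewrites pretty-printed bracketed parses as one tree per line. It is an illustration only and is not part of addDiscourse. The function name trees_to_one_per_line is hypothetical, and the sketch assumes Penn Treebank-style bracketing in which literal parentheses in the text are already escaped (for example as -LRB- and -RRB-), so that counting parentheses is enough to find tree boundaries.

```python
def trees_to_one_per_line(text):
    """Split bracketed parses into a list of single-line tree strings.

    Assumes every "(" and ")" in the input is a tree bracket, i.e. literal
    parentheses in the sentence text have been escaped by the parser.
    """
    trees = []
    depth = 0          # current bracket nesting depth
    current = []       # characters of the tree being read
    for ch in text:
        if ch == "(":
            depth += 1
        if depth > 0:
            current.append(ch)
        if ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                # A full tree has closed: collapse all runs of whitespace
                # (including newlines and indentation) to single spaces.
                trees.append(" ".join("".join(current).split()))
                current = []
    return trees


if __name__ == "__main__":
    pretty = """(S (NP (NNP John))
   (VP (VBZ runs)))
(S (NP (PRP He))
   (VP (VBZ wins)))
"""
    for line in trees_to_one_per_line(pretty):
        print(line)
```

Text outside any bracket, such as blank lines between trees, is dropped, and an unbalanced trailing tree is silently ignored rather than reported.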
There are two differences between the feature set described in "Using Syntax to Disambiguate Explicit Discourse Connectives in Text" and the one used by the addDiscourse tool, both designed to make the tool more compatible with parses produced by automatic parsers. The tool no longer uses the presence of traces in the right sibling as a feature, since most automatic parsers today do not produce traces. In addition, it no longer uses functional tags in syntactic categories, since automatic parsers strip those out (e.g., SBAR-TMP is now treated as equivalent to SBAR).
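The effect of ignoring functional tags can be shown with a short Python sketch. This is not the tool's own code; strip_functional_tags is a hypothetical helper that shows what it means for SBAR-TMP to be treated as SBAR. It assumes Penn Treebank-style labels, where functional tags and indices follow the base category after "-" or "=", and where labels that begin with "-" (such as -NONE-, -LRB-, -RRB-) are complete categories that must be left alone.

```python
import re

# A node label is the run of non-space, non-bracket characters after "(".
_LABEL = re.compile(r"\(([^\s()]+)")


def strip_functional_tags(tree):
    """Reduce each node label in a bracketed tree to its base category."""

    def _base(match):
        label = match.group(1)
        if label.startswith("-"):
            # Labels such as -NONE- or -LRB- are kept unchanged.
            return "(" + label
        # Keep only the part before the first "-" or "=".
        return "(" + re.split(r"[-=]", label, maxsplit=1)[0]

    return _LABEL.sub(_base, tree)


if __name__ == "__main__":
    gold = ("(S (NP-SBJ (PRP He)) (VP (VBD left) "
            "(SBAR-TMP (IN since) (S (NP-SBJ (PRP they)) (VP (VBD met))))))")
    print(strip_functional_tags(gold))
```

Only labels are rewritten: words in the sentence are never preceded directly by an opening bracket, so hyphenated tokens in the text are untouched.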