This page hosts the pilot corpus for sentence specificity described in our LREC 2016 paper, Improving the Annotation of Sentence Specificity. The dataset annotates every sentence in 16 New York Times articles and has the following major components:
- Rating of sentence specificity;
- Marking of underspecified segments;
- Free text questions for each marked segment;
- Whether the answer to each question is in the immediate context, in the prior context, topically related, or in no context.
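To make the four components concrete, here is a minimal Python sketch of how one annotated sentence could be represented in memory. It is an illustration only: the class names, field names, and the `count_by_context` helper are hypothetical and do not describe the released file format, and the example sentence is invented rather than taken from the corpus.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class AnswerContext(Enum):
    """Hypothetical labels for where the answer to a question is found."""
    IMMEDIATE = "immediate context"
    PRIOR = "prior context"
    TOPICAL = "topically related"
    NONE = "no context"


@dataclass
class UnderspecifiedSegment:
    text: str                      # the marked span within the sentence
    question: str                  # free-text question written for that span
    answer_context: AnswerContext  # where the answer to the question lies


@dataclass
class SentenceAnnotation:
    sentence: str
    specificity: float  # rating value; the scale itself is defined in the paper
    segments: List[UnderspecifiedSegment] = field(default_factory=list)


def count_by_context(annotations: List[SentenceAnnotation]) -> Dict[AnswerContext, int]:
    """Count marked segments per answer-context category."""
    counts = {ctx: 0 for ctx in AnswerContext}
    for ann in annotations:
        for seg in ann.segments:
            counts[seg.answer_context] += 1
    return counts


# Invented example, not drawn from the corpus.
example = SentenceAnnotation(
    sentence="He said it would happen soon.",
    specificity=3.0,
    segments=[
        UnderspecifiedSegment("it", "What would happen?", AnswerContext.PRIOR),
        UnderspecifiedSegment("soon", "When exactly?", AnswerContext.NONE),
    ],
)
counts = count_by_context([example])
```

Keeping the answer location as an enumeration rather than a free string is one way to make per-category tallies straightforward; the actual distribution files may be organized differently.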
Junyi Jessy Li, Bridget O'Daniel, Yi Wu, Wenli Zhao and Ani Nenkova. 2016. Improving the Annotation of Sentence Specificity. In Proceedings of LREC. [pdf] [bib]
Please send comments and feedback to J. Jessy Li.