Paper: Genre distinctions for discourse in the Penn TreeBank

ACL ID P09-1076
Title Genre distinctions for discourse in the Penn TreeBank
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2009

Articles in the Penn TreeBank were identified as being reviews, summaries, letters to the editor, news reportage, corrections, wit and short verse, or quarterly profit reports. All but the last three were then characterised in terms of features manually annotated in the Penn Discourse TreeBank: discourse connectives and their senses. Summaries turned out to display very different discourse features than the other three genres. Letters also appeared to have some different features. The two main findings involve (1) differences between genres in the senses associated with intra-sentential discourse connectives, inter-sentential discourse connectives and inter-sentential discourse relations that are not lexically marked; and (2) differences within all four genres be- ...