Paper: Building Chinese Discourse Corpus with Connective-driven Dependency Tree Structure

ACL ID D14-1224
Title Building Chinese Discourse Corpus with Connective-driven Dependency Tree Structure
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

In this paper, we propose a Connective-driven Dependency Tree (CDT) scheme to represent the discourse rhetorical structure in Chinese language, with elementary discourse units as leaf nodes and connectives as non-leaf nodes, largely motivated by the Penn Discourse Treebank and the Rhetorical Structure Theory. In particular, connectives are employed to directly represent the hierarchy of the tree structure and the rhetorical relation of a discourse, while the nuclei of discourse units are globally determined with reference to the dependency theory. Guided by the CDT scheme, we manually annotate a Chinese Discourse Treebank (CDTB) of 500 documents. Preliminary evaluation justifies the appropriateness of the CDT scheme to Chinese discourse analysis and the usefuln...
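
The tree shape described in the abstract (elementary discourse units as leaves, connectives as internal nodes) can be illustrated with a minimal Python sketch. This is not the CDTB annotation format: the class names `EDU` and `ConnectiveNode`, the relation labels, the integer `nucleus` index, and the helper functions are all assumptions made for illustration, and the example sentences are invented.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class EDU:
    """Leaf node: an elementary discourse unit (hypothetical class name)."""
    text: str


@dataclass
class ConnectiveNode:
    """Non-leaf node: a connective linking its child discourse units."""
    connective: str   # surface connective, e.g. "因为" ("because")
    relation: str     # rhetorical relation label (illustrative label set)
    children: List["Node"] = field(default_factory=list)
    nucleus: int = 0  # index of the nuclear child (assumed encoding)


Node = Union[EDU, ConnectiveNode]


def leaves(node: Node) -> List[str]:
    """Collect EDU texts in left-to-right order."""
    if isinstance(node, EDU):
        return [node.text]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def depth(node: Node) -> int:
    """Number of connective layers above the deepest EDU."""
    if isinstance(node, EDU):
        return 0
    return 1 + max((depth(c) for c in node.children), default=0)


def head_edu(node: Node) -> EDU:
    """Follow nucleus indices down to a single EDU (an assumed reading
    of how nuclearity could pick out a head, not the paper's procedure)."""
    while isinstance(node, ConnectiveNode):
        node = node.children[node.nucleus]
    return node


# Invented two-level example: a causal pair embedded in a contrast.
inner = ConnectiveNode(
    connective="因为",
    relation="cause",
    children=[EDU("it rained"), EDU("the match was cancelled")],
    nucleus=1,
)
root = ConnectiveNode(
    connective="但是",
    relation="contrast",
    children=[inner, EDU("the fans stayed")],
    nucleus=1,
)

print(leaves(root))
print(depth(root))
print(head_edu(root).text)
```

Storing the connective and relation on the internal node mirrors the abstract's statement that connectives carry both the hierarchy and the rhetorical relation; keeping nuclearity as a separate field reflects that it is determined independently of the connective.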