Paper: An Automatic Treebank Conversion Algorithm For Corpus Sharing

ACL ID P94-1034
Title An Automatic Treebank Conversion Algorithm For Corpus Sharing
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1994
Authors

An automatic treebank conversion method is proposed in this paper to convert a treebank into another treebank. A new treebank associated with a different grammar can be generated automatically from the old one such that the information in the original treebank can be transformed to the new one and be shared among different research communities. The simple algorithm achieves conversion accuracy of 96.4% when tested on 8,867 sentences between two major grammar revisions of a large MT system.

Motivation

Corpus-based research is now a major branch for language processing. One major resource for corpus-based research is the treebanks available in many research organizations [Marcus et al. 1993], which carry skeletal syntactic structures or 'brackets' that have been manually verified. U...