Paper: Building a Hierarchically Aligned Chinese-English Parallel Treebank

ACL ID C14-1143
Title Building a Hierarchically Aligned Chinese-English Parallel Treebank
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

We construct a hierarchically aligned Chinese-English parallel treebank by manually performing word alignment and phrase alignment simultaneously on parallel phrase-based parse trees. The main innovation of our approach is that we leave words without a translation counterpart (mostly language-particular function words) unaligned at the word level, and instead locate and align the appropriate phrases that encapsulate them. In doing so, we harmonize word-level and phrase-level alignments. We show that this type of annotation can be performed with high inter-annotator consistency and has both linguistic and engineering potential.
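
As an illustration of the idea only, the sketch below shows one way an untranslated function word could be left unaligned at the word level while the smallest phrase containing it is paired with a phrase on the other side. The treebank itself is annotated manually; this is not the authors' procedure. The function names, the span representation (half-open token index pairs), the consistency check, and the toy sentence pair are all assumptions introduced for this example.

```python
from typing import Optional

Span = tuple[int, int]  # half-open token span [start, end); hypothetical representation


def smallest_covering(spans: list[Span], lo: int, hi: int) -> Optional[Span]:
    """Return the shortest span in `spans` that covers [lo, hi), or None."""
    covering = [s for s in spans if s[0] <= lo and hi <= s[1]]
    return min(covering, key=lambda s: s[1] - s[0], default=None)


def phrase_for_unaligned(
    word: int,
    src_spans: list[Span],
    tgt_spans: list[Span],
    links: set[tuple[int, int]],
) -> Optional[tuple[Span, Span]]:
    """Pair the smallest source phrase around an unaligned word with a target phrase.

    `links` holds word alignments as (source index, target index) pairs.
    """
    # Source phrases that contain the word, smallest first.
    candidates = sorted(
        (s for s in src_spans if s[0] <= word < s[1]),
        key=lambda s: s[1] - s[0],
    )
    for src in candidates:
        # Target positions reached from aligned words inside this phrase.
        projected = [t for s, t in links if src[0] <= s < src[1]]
        if not projected:
            continue  # phrase holds no aligned word; try a larger one
        tgt = smallest_covering(tgt_spans, min(projected), max(projected) + 1)
        if tgt is None:
            continue
        # Assumed consistency check: nothing inside the target phrase
        # may be aligned to a word outside the source phrase.
        if all(src[0] <= s < src[1] for s, t in links if tgt[0] <= t < tgt[1]):
            return src, tgt
    return None


# Toy data (invented for illustration, not taken from the treebank).
zh_tokens = ["我", "读", "了", "书"]        # "了" has no English counterpart
en_tokens = ["I", "read", "the", "book"]   # "the" has no Chinese counterpart
zh_spans = [(0, 4), (0, 1), (1, 4), (3, 4)]  # IP, NP, VP, NP
en_spans = [(0, 4), (0, 1), (1, 4), (2, 4)]  # S, NP, VP, NP
zh_en_links = {(0, 0), (1, 1), (3, 3)}       # word-level links only
en_zh_links = {(t, s) for s, t in zh_en_links}

zh_result = phrase_for_unaligned(2, zh_spans, en_spans, zh_en_links)
en_result = phrase_for_unaligned(2, en_spans, zh_spans, en_zh_links)
```

In this toy pair, the aspect marker "了" stays unaligned and is covered by a phrase alignment between the two verb phrases, while "the" is covered by aligning the English noun phrase "the book" with the Chinese noun phrase "书".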