Paper: Human Judgements in Parallel Treebank Alignment

ACL ID W08-1208
Title Human Judgements in Parallel Treebank Alignment
Venue Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering
Year 2008

We have built a parallel treebank that includes word and phrase alignment. The alignment information was manually checked using a graphical tool that allows the annotator to view a pair of trees from parallel sentences. We found the compilation of clear alignment guidelines to be a difficult task. However, experiments with a group of students have shown that we are on the right track, with up to 89% overlap between the student annotation and our own. At the same time, these experiments have helped us to pinpoint the weaknesses in the guidelines, many of which concerned unclear rules related to differences in grammatical forms between the languages.