Paper: Data point selection for cross-language adaptation of dependency parsers

ACL ID P11-2120
Title Data point selection for cross-language adaptation of dependency parsers
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

We consider a very simple, yet effective, approach to cross-language adaptation of dependency parsers. We first remove lexical items from the treebanks and map part-of-speech tags into a common tagset. We then train a language model on tag sequences in otherwise unlabeled target data and rank labeled source data by perplexity per word of tag sequences, from least similar to most similar to the target. We then train our target language parser on the most similar data points in the labeled source data. The strategy achieves much better results than a non-adapted baseline and state-of-the-art unsupervised dependency parsing, and results are comparable to more complex projection-based cross-language adaptation algorithms.
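
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the language model, so the add-one-smoothed bigram model, the function names, and the `keep_fraction` parameter below are all assumptions made for illustration. Sentences are represented as delexicalized lists of part-of-speech tags from a shared tagset.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary symbols (illustrative choice)

def train_bigram_lm(tag_sequences):
    """Count tag bigrams over target-language POS sequences (no words)."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {EOS}
    for tags in tag_sequences:
        padded = [BOS] + list(tags) + [EOS]
        vocab.update(tags)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    # Reserve one extra slot for tags never seen in the target data.
    return unigrams, bigrams, len(vocab) + 1

def perplexity_per_tag(tags, lm):
    """Per-token perplexity of one tag sequence under the bigram model."""
    unigrams, bigrams, v = lm
    padded = [BOS] + list(tags) + [EOS]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one smoothing (an assumption, chosen for simplicity).
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

def select_source_sentences(source, lm, keep_fraction=0.5):
    """Keep the source sentences whose tag sequences look most target-like."""
    ranked = sorted(source, key=lambda tags: perplexity_per_tag(tags, lm))
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]

# Toy data: tag sequences only, already mapped to a common tagset.
target = [["DET", "NOUN", "VERB"], ["DET", "NOUN", "VERB", "DET", "NOUN"]]
source = [["VERB", "VERB", "ADP", "ADP"], ["DET", "NOUN", "VERB"]]

lm = train_bigram_lm(target)
selected = select_source_sentences(source, lm, keep_fraction=0.5)
```

Here `selected` holds the low-perplexity source sentences, which would then serve as training data for the target-language parser. Sorting in ascending order of perplexity places the most target-like sentences first, so truncating the list keeps the most similar data points.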