Paper: Effects of Empty Categories on Machine Translation

ACL ID D10-1062
Title Effects of Empty Categories on Machine Translation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2010

We examine the effects that empty categories have on machine translation. Empty categories are elements in parse trees that lack corresponding overt surface forms (words), such as dropped pronouns and markers for control constructions. We start by training machine translation systems on data with manually inserted empty elements, and find that including some empty categories in the training data improves translation results. We then expand the experiment by automatically inserting these elements into a larger data set using various methods and training on the modified corpus. We show that even when automatic prediction of null elements is not highly accurate, it nevertheless improves the end translation result.
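
As an illustration of what inserting empty elements into training data could look like, the following is a minimal sketch, not the authors' implementation. It assumes Penn Treebank-style bracketed parses, in which empty categories appear as leaves tagged `-NONE-` (for example `*pro*` for a dropped pronoun). The function name `surface_tokens`, the `keep_empty` parameter, and the toy parse are hypothetical and chosen only for this example.

```python
import re

# Matches one (TAG word) leaf in a Penn Treebank-style bracketed parse.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def surface_tokens(tree, keep_empty=frozenset()):
    """Return the leaf tokens of a bracketed parse, in order.

    Leaves tagged -NONE- are empty categories. They are dropped unless
    their label is listed in keep_empty (a hypothetical parameter).
    """
    tokens = []
    for tag, word in LEAF.findall(tree):
        if tag == "-NONE-" and word not in keep_empty:
            continue  # ordinary surface string: skip the empty element
        tokens.append(word)
    return tokens


# Hypothetical toy parse with a dropped subject pronoun (*pro*).
toy = "(S (NP (-NONE- *pro*)) (VP (V ate) (NP (N rice))))"

plain = surface_tokens(toy)                           # surface words only
augmented = surface_tokens(toy, keep_empty={"*pro*"})  # *pro* kept as a token
```

Under these assumptions, the `augmented` token sequence would serve as the source side of a training sentence pair, so that the translation system can learn to align the inserted `*pro*` token with an overt pronoun on the target side. Selecting labels through `keep_empty` mirrors the abstract's observation that only some empty categories are helpful.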