Paper: A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

ACL ID P11-1001
Title A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

In this work we propose methods to label probabilistic synchronous context-free grammar (PSCFG) rules using only word tags, generated by either part-of-speech analysis or unsupervised word class induction. The proposals range from simple tag-combination schemes to a phrase clustering model that can incorporate an arbitrary number of features. Our models improve translation quality over the single generic label approach of Chiang (2005) and perform on par with the syntactically motivated approach from Zollmann and Venugopal (2006) on the NIST large Chinese-to-English translation task. These results persist when using automatically learned word tags, suggesting broad applicability of our technique across diverse language pairs for which syntactic resources are not available.
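
The abstract does not spell out what a tag-combination scheme looks like. As a minimal sketch, assume one plausible variant: a phrase is labeled by joining the word tags of its first and last words. The function names `label_phrase` and `label_all_spans`, the `+` separator, and the `max_len` parameter are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple


def label_phrase(tags: List[str], start: int, end: int) -> str:
    """Label the span tags[start..end] (inclusive) from its boundary tags.

    Assumed scheme: a one-word phrase takes its own tag; a longer
    phrase takes "<first tag>+<last tag>".
    """
    if not (0 <= start <= end < len(tags)):
        raise ValueError("span out of range")
    if start == end:
        return tags[start]
    return f"{tags[start]}+{tags[end]}"


def label_all_spans(tags: List[str], max_len: int = 3) -> Dict[Tuple[int, int], str]:
    """Label every span of at most max_len words in a tagged sentence."""
    labels: Dict[Tuple[int, int], str] = {}
    for start in range(len(tags)):
        for end in range(start, min(start + max_len, len(tags))):
            labels[(start, end)] = label_phrase(tags, start, end)
    return labels


if __name__ == "__main__":
    # Tags may come from a POS tagger or from unsupervised word classes;
    # the labeling code only needs one tag string per word.
    sentence_tags = ["DT", "JJ", "NN"]
    for span, label in sorted(label_all_spans(sentence_tags).items()):
        print(span, label)
```

Because the labels depend only on per-word tags, the same code applies unchanged when the tags are induced word-class identifiers rather than part-of-speech tags, which is the property the abstract highlights.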