Paper: How Much Can We Gain from Supervised Word Alignment?

ACL ID P11-2029
Title How Much Can We Gain from Supervised Word Alignment?
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011

Word alignment is a central problem in statistical machine translation (SMT). In recent years, supervised alignment algorithms, which improve alignment accuracy by mimicking human alignment, have attracted a great deal of attention. The objective of this work is to explore the performance limit of supervised alignment under the current SMT paradigm. Our experiments used a manually aligned Chinese-English corpus with 280K words recently released by the Linguistic Data Consortium (LDC). We treated the human alignment as the oracle of supervised alignment. The result is surprising: the gain of human alignment over a state-of-the-art unsupervised method (GIZA++) is less than 1 point in BLEU. Furthermore, we showed that the benefit of improved alignment becomes smaller ...