Paper: Fixed Length Word Suffix for Factored Statistical Machine Translation

ACL ID P10-2027
Title Fixed Length Word Suffix for Factored Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2010
Authors

Factored Statistical Machine Translation extends the Phrase Based SMT model by allowing each word to be a vector of factors. Experiments have shown the effectiveness of many factors, including Part of Speech tags, in improving the grammaticality of the output. However, high quality part of speech taggers are not available in the open domain for many languages. In this paper we used a fixed length word suffix as a new factor in Factored SMT, and were able to achieve significant improvements in three sets of experiments: a large NIST Arabic to English system, a medium WMT Spanish to English system, and a small TRANSTAC English to Iraqi system.
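
As a rough illustration of the idea in the abstract, the sketch below attaches a fixed length suffix to every token as a second factor. It is not the authors' implementation. The function names `suffix_factor` and `annotate`, the suffix length of 3, and the choice to reuse the whole word when it is no longer than the suffix length are all assumptions made for the example. The `|` separator follows the factored corpus convention used by toolkits such as Moses.

```python
def suffix_factor(word: str, n: int = 3) -> str:
    """Return the last n characters of a word.

    Assumption: words no longer than n are used whole as their own suffix.
    """
    return word[-n:] if len(word) > n else word


def annotate(sentence: str, n: int = 3, sep: str = "|") -> str:
    """Turn a whitespace-tokenized sentence into 'word|suffix' tokens."""
    return " ".join(f"{w}{sep}{suffix_factor(w, n)}" for w in sentence.split())


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the paper's data.
    print(annotate("the translators were working quickly"))
```

Because the suffix is computed directly from the surface form, this factor needs no tagger or other language-specific tool, which is the motivation the abstract gives for using it in place of Part of Speech tags.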