Paper: Simple Syntactic and Morphological Processing Can Help English-Hindi Statistical Machine Translation

ACL ID I08-1067
Title Simple Syntactic and Morphological Processing Can Help English-Hindi Statistical Machine Translation
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2008
Authors

In this paper, we report our work on incorporating syntactic and morphological information for English to Hindi statistical machine translation. Two simple and computationally inexpensive ideas have proven to be surprisingly effective: (i) reordering the English source sentence as per Hindi syntax, and (ii) using the suffixes of Hindi words. The former is done by applying simple transformation rules on the English parse tree. The latter, by using a simple suffix separation program. With only a small amount of bilingual training data and limited tools for Hindi, we achieve reasonable performance and substantial improvements over the baseline phrase-based system. Our approach eschews the use of parsing or other sophisticated linguistic tools for the target language (Hindi) ...
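
The two preprocessing ideas named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tuple-based tree format, the two reordering rules (verb moved to the end of its VP, preposition moved after its noun phrase), the romanized suffix list, and the function names `reorder`, `leaves`, and `split_suffix` are all hypothetical. The paper's actual transformation rules and suffix inventory are not given in this abstract.

```python
# Toy sketch only: the rules and suffix list below are hypothetical,
# not the rules used in the paper.

# A tree node is (label, children); a leaf is (POS tag, word).

def reorder(tree):
    """Recursively reorder an English parse tree toward verb-final order."""
    label, children = tree
    if isinstance(children, str):  # leaf: (POS tag, word)
        return tree
    kids = [reorder(child) for child in children]
    if label == "VP":
        # Assumed rule 1: move verbs (tags starting with "VB") after
        # the other children of the VP, giving object-verb order.
        verbs = [k for k in kids if k[0].startswith("VB")]
        rest = [k for k in kids if not k[0].startswith("VB")]
        kids = rest + verbs
    elif label == "PP":
        # Assumed rule 2: move the preposition after its complement,
        # so that it behaves like a postposition.
        preps = [k for k in kids if k[0] == "IN"]
        rest = [k for k in kids if k[0] != "IN"]
        kids = rest + preps
    return (label, kids)


def leaves(tree):
    """Return the words of a tree in left-to-right order."""
    label, children = tree
    if isinstance(children, str):
        return [children]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


# Hypothetical romanized suffix list, for illustration only.
SUFFIXES = ["iyon", "on", "en", "e"]


def split_suffix(word, suffixes=SUFFIXES, min_stem=2):
    """Split off the longest matching suffix, keeping a stem of at least min_stem characters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)], suffix
    return word, ""


example = (
    "S",
    [
        ("NP", [("NNP", "Ram")]),
        ("VP", [
            ("VBD", "ate"),
            ("NP", [("DT", "an"), ("NN", "apple")]),
            ("PP", [("IN", "in"), ("NP", [("DT", "the"), ("NN", "garden")])]),
        ]),
    ],
)

print(" ".join(leaves(reorder(example))))  # Ram an apple the garden in ate
print(split_suffix("ladkon"))              # ('ladk', 'on')
```

In a pipeline of this kind, the reordered English sentences and the suffix-separated Hindi tokens would presumably be fed to a standard phrase-based system in place of the raw text; how the paper integrates them is not specified in the excerpt above.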