Paper: Learning Phrase Boundaries for Hierarchical Phrase-based Translation

ACL ID C10-2044
Title Learning Phrase Boundaries for Hierarchical Phrase-based Translation
Venue International Conference on Computational Linguistics
Session Poster Session
Year 2010
Authors

Hierarchical phrase-based models pro- vide a powerful mechanism to capture non-local phrase reorderings for statis- tical machine translation (SMT). How- ever, many phrase reorderings are arbi- trary because the models are weak on de- termining phrase boundaries for pattern- matching. This paper presents a novel approach to learn phrase boundaries di- rectly from word-aligned corpus without using any syntactical information. We use phrase boundaries, which indicate the be- ginning/ending of phrase reordering, as soft constraints for decoding. Experi- mental results and analysis show that the approach yields significant improvements over the baseline on large-scale Chinese- to-English translation.