Paper: Insertion and Deletion Models for Statistical Machine Translation

ACL ID N12-1035
Title Insertion and Deletion Models for Statistical Machine Translation
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

We investigate insertion and deletion models for hierarchical phrase-based statistical machine translation. Insertion and deletion models are designed as a means to avoid the omission of content words in the hypotheses. In our case, they are implemented as phrase-level feature functions which count the number of inserted or deleted words. An English word is considered inserted or deleted based on lexical probabilities with the words on the foreign language side of the phrase. Related techniques have been employed before by Och et al. (2003) in an n-best reranking framework and by Mauser et al. (2006) and Zens (2008) in a standard phrase-based translation system. We propose novel thresholding methods in this work and study insertion and deletion features which are based on two...
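
The counting idea described in the abstract can be illustrated with a minimal sketch. It is one plausible reading of the abstract and not the authors' implementation: the function names, the toy lexicon, the fixed threshold of 0.1, and the choice to treat unseen word pairs as probability 0.0 are all assumptions made for illustration. Here an English word counts as inserted if its lexical probability falls below the threshold for every foreign word in the phrase, and a foreign word counts as deleted if no English word in the phrase reaches the threshold for it.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical lexicon: maps (english_word, foreign_word) to p(english | foreign).
LexTable = Dict[Tuple[str, str], float]


def count_inserted(english: Sequence[str], foreign: Sequence[str],
                   lex: LexTable, threshold: float) -> int:
    """Count English words whose probability is below the threshold
    for every foreign word in the phrase (assumed insertion criterion)."""
    return sum(
        1 for e in english
        if all(lex.get((e, f), 0.0) < threshold for f in foreign)
    )


def count_deleted(english: Sequence[str], foreign: Sequence[str],
                  lex: LexTable, threshold: float) -> int:
    """Count foreign words for which no English word in the phrase
    reaches the threshold (assumed deletion criterion)."""
    return sum(
        1 for f in foreign
        if all(lex.get((e, f), 0.0) < threshold for e in english)
    )


# Toy data, invented for this example only.
toy_lex: LexTable = {
    ("the", "das"): 0.6,
    ("house", "Haus"): 0.8,
    ("big", "Haus"): 0.01,
}

inserted = count_inserted(["the", "big", "house"], ["das", "Haus"], toy_lex, 0.1)
deleted = count_deleted(["the", "house"], ["das", "alte", "Haus"], toy_lex, 0.1)
print(inserted, deleted)
```

In the toy data, "big" has no foreign word supporting it above the threshold, so it is counted as inserted, and "alte" has no English counterpart, so it is counted as deleted. The abstract states that the paper proposes its own thresholding methods; the single fixed threshold here is only a stand-in for those.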