Paper: A Phrase-Based Unigram Model For Statistical Machine Translation

ACL ID N03-2036
Title A Phrase-Based Unigram Model For Statistical Machine Translation
Venue Human Language Technologies
Session Short Paper
Year 2003
Authors

In this paper, we describe a phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models. The units of translation are blocks, that is, pairs of phrases. During decoding, we use a block unigram model and a word-based trigram language model. During training, the blocks are learned from source interval projections using an underlying word alignment. We show experimental results on block selection criteria based on unigram counts and phrase length.

1 Phrase-based Unigram Model

Various papers use phrase-based translation systems (Och et al., 1999; Marcu and Wong, 2002; Yamada and Knight, 2002) that have been shown to improve translation quality over single-word based translation systems introduced in (Brown et a...
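
The block-learning step summarized in the abstract (projecting source intervals through a word alignment, then counting the resulting blocks) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `project_source_intervals` and `block_unigram_counts`, the `max_len` limit, the representation of an alignment as a set of `(source_index, target_index)` pairs, and the simple min/max projection rule are all assumptions made for illustration.

```python
from collections import Counter


def project_source_intervals(src, tgt, alignment, max_len=3):
    """Collect (source_phrase, target_phrase) blocks from one sentence pair.

    Assumption: `alignment` is a set of (source_index, target_index) pairs,
    and a source interval is projected to the smallest target interval that
    covers every target word aligned to it.
    """
    blocks = []
    for start in range(len(src)):
        # Source intervals of length 1..max_len starting at `start`.
        for end in range(start, min(start + max_len, len(src))):
            targets = [t for (s, t) in alignment if start <= s <= end]
            if not targets:
                continue  # unaligned interval: nothing to project
            lo, hi = min(targets), max(targets)
            blocks.append((tuple(src[start:end + 1]), tuple(tgt[lo:hi + 1])))
    return blocks


def block_unigram_counts(corpus, max_len=3):
    """Count how often each block occurs over an aligned corpus.

    `corpus` is an iterable of (src_tokens, tgt_tokens, alignment) triples.
    """
    counts = Counter()
    for src, tgt, alignment in corpus:
        counts.update(project_source_intervals(src, tgt, alignment, max_len))
    return counts
```

Under these assumptions, selecting blocks by unigram count or phrase length amounts to filtering the resulting `Counter`, for example keeping only blocks whose count reaches a chosen threshold or whose source phrase stays under a length limit.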