Paper: Lattice Desegmentation for Statistical Machine Translation

ACL ID P14-1010
Title Lattice Desegmentation for Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

Morphological segmentation is an effective sparsity reduction strategy for statistical machine translation (SMT) involving morphologically complex languages. When translating into a segmented language, an extra step is required to desegment the output; previous studies have desegmented the 1-best output from the decoder. In this paper, we expand our translation options by desegmenting n-best lists or lattices. Our novel lattice desegmentation algorithm effectively combines both segmented and desegmented views of the target language for a large subspace of possible translation outputs, which allows for inclusion of features related to the desegmentation process, as well as an unsegmented language model (LM). We investigate this technique in the context of English-to-...
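The n-best variant mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it does not cover the lattice algorithm. It assumes a hypothetical marker scheme in which a trailing "+" marks a prefix and a leading "+" marks a suffix, it desegments by plain concatenation (ignoring any orthographic adjustments a real desegmenter may need), and it stands in a toy vocabulary-based scorer for the unsegmented LM. The names `desegment`, `rerank`, and `unseg_score` are illustrative only.

```python
from typing import Callable, List, Tuple


def desegment(tokens: List[str]) -> List[str]:
    """Join morphemes into words by plain concatenation.

    Assumed marker scheme (hypothetical): "w+" is a prefix that attaches
    to the following token, "+hA" is a suffix that attaches to the
    preceding token.
    """
    words: List[str] = []
    attach_next = False  # True when the previous token was a prefix
    for tok in tokens:
        is_suffix = len(tok) > 1 and tok.startswith("+")
        is_prefix = len(tok) > 1 and tok.endswith("+")
        core = tok[1:] if is_suffix else tok
        core = core[:-1] if is_prefix else core
        if words and (attach_next or is_suffix):
            words[-1] += core
        else:
            words.append(core)
        attach_next = is_prefix
    return words


def rerank(
    nbest: List[Tuple[List[str], float]],
    unseg_score: Callable[[List[str]], float],
    weight: float = 1.0,
) -> List[Tuple[List[str], float]]:
    """Desegment every hypothesis and add a weighted unsegmented-side score."""
    rescored = []
    for tokens, decoder_score in nbest:
        words = desegment(tokens)
        rescored.append((words, decoder_score + weight * unseg_score(words)))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for an unsegmented LM: penalize words outside a tiny vocabulary.
VOCAB = {"wqAl"}


def toy_score(words: List[str]) -> float:
    return float(sum(0 if w in VOCAB else -1 for w in words))


nbest = [(["w+", "ktb"], -1.0), (["w+", "qAl"], -1.2)]
best_words, best_score = rerank(nbest, toy_score)[0]
```

In this toy run the second hypothesis overtakes the decoder's 1-best once the desegmented form is scored, which is the kind of reordering that desegmenting only the 1-best output cannot produce.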