Paper: BiTAM: Bilingual Topic AdMixture Models For Word Alignment

ACL ID P06-2124
Title BiTAM: Bilingual Topic AdMixture Models For Word Alignment
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006

We propose a novel bilingual topical admixture (BiTAM) formalism for word alignment in statistical machine translation. Under this formalism, the parallel sentence-pairs within a document-pair are assumed to constitute a mixture of hidden topics; each word-pair follows a topic-specific bilingual translation model. Three BiTAM models are proposed to capture topic sharing at different levels of linguistic granularity (i.e., at the sentence or word levels). These models enable word-alignment process to leverage topical contents of document-pairs. Efficient variational approximation algorithms are designed for inference and parameter estimation. With the inferred latent topics, BiTAM models facilitate coherent pairing of bilingual linguistic entities that share common topic...
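To make the generative story in the abstract concrete, the following is a minimal sketch of sentence-level topic sharing: a document-pair draws topic weights, each sentence-pair draws one topic, and each target word is translated from an aligned source word using that topic's translation table. Several details are assumptions made for illustration and are not stated in the abstract: the Dirichlet prior on topic weights, the uniform alignment choice, the target length equal to the source length, and all names (`sample_dirichlet`, `generate_document_pair`, `TOY_TABLES`) together with the toy English-French vocabulary. It covers sampling only, not the variational inference the abstract mentions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw topic weights from a Dirichlet via normalized Gamma draws.
    (Assumption: the abstract does not name the prior over topic weights.)"""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document_pair(source_sentences, alpha, tables, rng):
    """Sample target sentences for one document-pair.

    tables[k][e] is a dict mapping target word f to p(f | e, topic k).
    Returns the document-level topic weights and, per sentence-pair,
    a tuple (topic, alignment, target_words).
    """
    theta = sample_dirichlet(alpha, rng)
    sentence_pairs = []
    for e_sent in source_sentences:
        # One topic per sentence-pair (sentence-level topic sharing).
        z = rng.choices(range(len(theta)), weights=theta, k=1)[0]
        alignment, f_sent = [], []
        # Toy assumption: target length equals source length.
        for _ in range(len(e_sent)):
            # Toy assumption: alignment positions are chosen uniformly.
            a = rng.randrange(len(e_sent))
            dist = tables[z][e_sent[a]]
            words = list(dist)
            weights = [dist[w] for w in words]
            f = rng.choices(words, weights=weights, k=1)[0]
            alignment.append(a)
            f_sent.append(f)
        sentence_pairs.append((z, alignment, f_sent))
    return theta, sentence_pairs


# Hypothetical two-topic tables: "bank" translates differently per topic.
TOY_TABLES = [
    {"the": {"la": 1.0}, "bank": {"banque": 0.9, "rive": 0.1}},  # finance
    {"the": {"la": 1.0}, "bank": {"rive": 0.9, "banque": 0.1}},  # river
]

rng = random.Random(0)
theta, sentence_pairs = generate_document_pair(
    [["the", "bank"], ["the", "bank"]], [1.0, 1.0], TOY_TABLES, rng
)
for topic, alignment, target in sentence_pairs:
    print(topic, alignment, target)
```

The point of the toy tables is that the same source word receives a different translation distribution depending on the sampled topic, which is the mechanism the abstract credits for topically coherent word pairing.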