Paper: Bilingual-LSA Based LM Adaptation for Spoken Language Translation

ACL ID P07-1066
Title Bilingual-LSA Based LM Adaptation for Spoken Language Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2007
Authors

We propose a novel approach to crosslingual language model (LM) adaptation based on bilingual Latent Semantic Analysis (bLSA). A bLSA model is introduced which enables latent topic distributions to be efficiently transferred across languages by enforcing a one-to-one topic correspondence during training. Using the proposed bLSA framework, crosslingual LM adaptation can be performed by first inferring the topic posterior distribution of the source text and then applying the inferred distribution to the target language N-gram LM via marginal adaptation. The proposed framework also enables rapid bootstrapping of LSA models for new languages based on a source LSA model from another language. On Chinese to English speech and text translation the proposed bLSA framework successfully...
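
The two-step adaptation outlined in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the sketch uses a commonly cited form of marginal adaptation, in which the background probability P_bg(w|h) is rescaled by (P_lsa(w) / P_bg(w))^beta and renormalized. The function names, the toy vocabulary, the probability values, and the tuning factor `beta` are all hypothetical. The topic posterior `theta` stands in for the distribution inferred from the source text; because topics are assumed to correspond one-to-one across languages, it is applied directly to the target-language topic unigrams.

```python
def topic_mixture_unigram(theta, topic_word):
    """Mix per-topic target-language unigrams using source-inferred topic weights."""
    vocab = topic_word[0].keys()
    return {w: sum(t * tw[w] for t, tw in zip(theta, topic_word)) for w in vocab}


def marginal_adapt(bg_cond, bg_unigram, lsa_unigram, beta=0.7):
    """Rescale P_bg(w|h) by (P_lsa(w) / P_bg(w)) ** beta, then renormalize over w."""
    scaled = {
        w: p * (lsa_unigram[w] / bg_unigram[w]) ** beta
        for w, p in bg_cond.items()
    }
    z = sum(scaled.values())
    return {w: p / z for w, p in scaled.items()}


# Hypothetical toy data: two topics over a three-word target vocabulary.
theta = [0.9, 0.1]  # topic posterior inferred from the source text (assumed)
topic_word = [
    {"bank": 0.6, "river": 0.1, "loan": 0.3},  # topic 0 unigram (assumed)
    {"bank": 0.3, "river": 0.6, "loan": 0.1},  # topic 1 unigram (assumed)
]
bg_unigram = {"bank": 0.4, "river": 0.4, "loan": 0.2}  # background P_bg(w)
bg_cond = {"bank": 0.5, "river": 0.3, "loan": 0.2}     # background P_bg(w | h) for one history h

lsa_unigram = topic_mixture_unigram(theta, topic_word)
adapted = marginal_adapt(bg_cond, bg_unigram, lsa_unigram)
```

In this toy setup the source text is dominated by topic 0, so words favored by that topic gain probability mass under the adapted model while words tied to topic 1 lose it. Setting `beta=0` leaves the background model unchanged.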