Paper: Learning Topic Representation for SMT with Neural Networks

ACL ID P14-1013
Title Learning Topic Representation for SMT with Neural Networks
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014

Statistical Machine Translation (SMT) usually utilizes contextual information to disambiguate translation candidates. However, it is often limited to contexts within sentence boundaries, hence broader topical information cannot be leveraged. In this paper, we propose a novel approach to learning topic representation for parallel data using a neural network architecture, where abundant topical contexts are embedded via topic relevant monolingual data. By associating each translation rule with the topic representation, topic relevant rules are selected according to the distributional similarity with the source text during SMT decoding. Experimental results show that our method significantly improves translation accuracy in the NIST Chinese-to-English translation task compared to ...
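
The rule selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the source text and each translation rule already have a dense topic vector, and that similarity is measured with cosine similarity; the abstract only says "distributional similarity", so the choice of cosine, the function names `cosine_similarity` and `rank_rules_by_topic`, and the toy vectors are all illustrative assumptions rather than the paper's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_rules_by_topic(source_topic, rule_topics):
    """Rank translation rules by topic similarity to the source text.

    source_topic: topic vector of the source text (hypothetical input).
    rule_topics:  dict mapping a rule identifier to its topic vector
                  (hypothetical; the abstract does not specify a data format).
    Returns a list of (similarity, rule) pairs, most similar first.
    """
    scored = [
        (cosine_similarity(source_topic, vec), rule)
        for rule, vec in rule_topics.items()
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy example: two candidate rules with made-up two-dimensional topic vectors.
source_topic = [1.0, 0.0]
rule_topics = {
    "rule_A": [0.9, 0.1],
    "rule_B": [0.1, 0.9],
}
ranked = rank_rules_by_topic(source_topic, rule_topics)
```

In a decoder, such a score would more plausibly be added as one feature among others than used as a hard filter; how the similarity enters decoding is not stated in the abstract, so that too is an assumption.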