Paper: Vector Space Model for Adaptation in Statistical Machine Translation

ACL ID P13-1126
Title Vector Space Model for Adaptation in Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

This paper proposes a new approach to domain adaptation in statistical machine translation (SMT) based on a vector space model (VSM). The general idea is first to create a vector profile for the in-domain development ("dev") set. This profile might, for instance, be a vector with a dimensionality equal to the number of training subcorpora; each entry in the vector reflects the contribution of a particular subcorpus to all the phrase pairs that can be extracted from the dev set. Then, for each phrase pair extracted from the training data, we create a vector with features defined in the same way, and calculate its similarity score with the vector representing the dev set. Thus, we obtain a decoding feature whose value represents the phrase pair's closeness to the dev set. This is a...
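The profile-and-similarity idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the subcorpus names, the toy counts, the use of raw counts as vector entries, and the choice of cosine similarity (the abstract only says "similarity score" and does not name a weighting scheme or a similarity function). The helper names `phrase_pair_vector`, `dev_profile`, and `cosine` are hypothetical and do not come from the paper.

```python
import math

# Hypothetical training subcorpora; one vector dimension per subcorpus.
SUBCORPORA = ["news", "web", "parliament"]

# Toy data (invented): how often each phrase pair was extracted
# from each training subcorpus.
TRAIN_COUNTS = {
    ("maison", "house"): {"news": 3, "web": 1},
    ("chat", "cat"): {"web": 4},
    ("loi", "law"): {"news": 1, "parliament": 5},
}

# Toy data (invented): phrase pairs that can be extracted from the dev set.
DEV_PAIRS = [("maison", "house"), ("loi", "law")]


def phrase_pair_vector(pair):
    """One entry per subcorpus: the raw count of this phrase pair there.

    Raw counts are an assumption; a real system might weight them.
    """
    counts = TRAIN_COUNTS.get(pair, {})
    return [float(counts.get(name, 0)) for name in SUBCORPORA]


def dev_profile(dev_pairs):
    """Sum the vectors of all dev-set phrase pairs into one profile."""
    profile = [0.0] * len(SUBCORPORA)
    for pair in dev_pairs:
        vec = phrase_pair_vector(pair)
        profile = [p + v for p, v in zip(profile, vec)]
    return profile


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


profile = dev_profile(DEV_PAIRS)

# One score per training phrase pair; in the setting the abstract
# describes, such a score would be used as a decoding feature.
scores = {pair: cosine(phrase_pair_vector(pair), profile) for pair in TRAIN_COUNTS}

for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this toy setup, a phrase pair seen mostly in the subcorpora that dominate the dev profile scores higher than one seen only in a subcorpus that contributes little to it, which is the intuition behind using the score as a measure of closeness to the dev set.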