Paper: Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

ACL ID P13-1140
Title Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

Currently, almost all statistical machine translation (SMT) models are trained on parallel corpora from specific domains. However, for a language pair or a domain without any bilingual resources, traditional SMT loses its power. Recently, some research has studied unsupervised SMT, inducing a simple word-based translation model from monolingual corpora. This approach successfully bypasses the bitext constraint of SMT and obtains relatively promising results. In this paper, we take a step forward and propose a simple but effective method to induce a phrase-based model from monolingual corpora, given an automatically induced translation lexicon or a manually edited translation dictionary. We apply our method for the do...