Paper: Machine Translation Detection from Monolingual Web-Text

ACL ID P13-1157
Title Machine Translation Detection from Monolingual Web-Text
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

We propose a method for automatically detecting low-quality Web-text translated by statistical machine translation (SMT) systems. We focus on the phrase salad phenomenon that is observed in existing SMT results and propose a set of computationally inexpensive features to effectively detect such machine-translated sentences from a large-scale Web-mined text. Unlike previous approaches that require bilingual data, our method uses only monolingual text as input; therefore it is applicable for refining data produced by a variety of Web-mining activities. Evaluation results show that the proposed method achieves an accuracy of 95.8% for sentences and 80.6% for text in noisy Web pages.