Paper: Improve SMT Quality with Automatically Extracted Paraphrase Rules

ACL ID P12-1103
Title Improve SMT Quality with Automatically Extracted Paraphrase Rules
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

Abstract We propose a novel approach to improving SMT via paraphrase rules that are automatically extracted from the bilingual training data. Without using extra paraphrase resources, we acquire the rules by comparing the source side of the parallel corpus with the target-to-source translations of the target side. Besides word and phrase paraphrases, the acquired rules mainly cover structured paraphrases at the sentence level. These rules are employed to enrich the SMT inputs and thereby improve translation quality. The experimental results show that our proposed approach achieves significant improvements of 1.6 to 3.6 BLEU points in the oral domain and 0.5 to 1 BLEU points in the news domain.
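
The comparison step described in the abstract (source sentence versus the target-to-source translation of its reference) can be illustrated with a minimal sketch. This is not the paper's extraction procedure: it assumes whitespace tokenization and uses a plain token-level diff from Python's standard `difflib` in place of whatever alignment the authors use. The function name `extract_candidate_pairs` and the example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def extract_candidate_pairs(source, back_translation):
    """Return (source_span, back_translation_span) pairs where the sentences differ.

    Hypothetical helper: a token-level diff stands in for the paper's
    actual rule extraction, which the abstract does not specify.
    """
    src_tokens = source.split()
    bt_tokens = back_translation.split()
    matcher = SequenceMatcher(a=src_tokens, b=bt_tokens, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only "replace" regions have material on both sides to pair up.
        if tag == "replace":
            pairs.append(
                (" ".join(src_tokens[i1:i2]), " ".join(bt_tokens[j1:j2]))
            )
    return pairs


# Invented example: an original source sentence and a back-translation
# of its target-side reference.
pairs = extract_candidate_pairs(
    "how much is the fare to the airport",
    "how much does it cost to the airport",
)
print(pairs)
```

Each returned pair is only a candidate: the differing span on the source side paired with the span the back-translation produced in its place. Filtering and generalizing such pairs into sentence-level rules is beyond what this sketch attempts.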