Paper: Improving MT System Using Extracted Parallel Fragments of Text from Comparable Corpora

ACL ID W13-2509
Title Improving MT System Using Extracted Parallel Fragments of Text from Comparable Corpora
Venue Building and Using Comparable Corpora
Session
Year 2013
Authors

In this article, we present an automated approach for extracting English-Bengali parallel fragments of text from comparable corpora created using Wikipedia documents. Our approach exploits the multilingualism of Wikipedia. Most importantly, the approach does not require any domain-specific corpus. We have been able to improve the BLEU score of an existing domain-specific English-Bengali machine translation system by 11.14%.