Paper: Bilingual Lexicon Extraction from Comparable Corpora Enhanced with Parallel Corpora

ACL ID W11-1205
Title Bilingual Lexicon Extraction from Comparable Corpora Enhanced with Parallel Corpora
Venue Building and Using Comparable Corpora
Session
Year 2011
Authors

In this article, we present a simple and effective approach for extracting a bilingual lexicon from comparable corpora enhanced with parallel corpora. We make use of structural characteristics of the documents comprising the comparable corpus to extract parallel sentences with a high degree of quality. We then use state-of-the-art techniques to build a specialized bilingual lexicon from these sentences and evaluate the contribution of this lexicon when added to the comparable corpus-based alignment technique. Finally, the value of this approach is demonstrated by the improvement of translation accuracy for medical words.