Paper: Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation

ACL ID D12-1003
Title Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

This paper proposes a novel method for lexicon extraction that extracts translation pairs from comparable corpora by using graph-based label propagation. In previous work, it was established that performance drastically decreases when the coverage of a seed lexicon is small. We resolve this problem by utilizing indirect relations with the bilingual seeds together with direct relations, in which each word is represented by a distribution of translated seeds. The seed distributions are propagated over a graph representing relations among words, and translation pairs are extracted by identifying word pairs with a high similarity in the seed distributions. We propose two types of graphs: a co-occurrence graph, representing co-occurrence relations between words, and a simil...
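
The propagation-and-matching idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule (row-normalized averaging with seed clamping), the cosine similarity measure, the toy graphs, and all function names (`propagate`, `cosine`, `best_translation`) are assumptions chosen for illustration. Each language has its own word graph, seed words carry a one-hot distribution over shared seed labels, and a non-seed word with only an indirect link to a seed still receives a usable distribution.

```python
import numpy as np


def propagate(adj, seed_labels, n_seeds, iterations=10):
    """Propagate seed distributions over a weighted word graph.

    adj: (n, n) array of non-negative edge weights (hypothetical input).
    seed_labels: dict mapping node id -> seed label id for seed words.
    Returns an (n, n_seeds) array; row i is the seed distribution of word i.
    """
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    # Row-normalize edge weights; isolated nodes keep an all-zero row.
    trans = adj / np.where(row_sums > 0, row_sums, 1.0)

    q = np.zeros((adj.shape[0], n_seeds))
    for node, label in seed_labels.items():
        q[node, label] = 1.0

    for _ in range(iterations):
        # Each word takes the weighted average of its neighbours' distributions.
        q = trans @ q
        # Clamp seed words back to their one-hot distributions.
        for node, label in seed_labels.items():
            q[node] = 0.0
            q[node, label] = 1.0
    return q


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def best_translation(src_q, tgt_q, src_node, tgt_candidates):
    """Return the target candidate whose seed distribution is most similar."""
    return max(tgt_candidates, key=lambda t: cosine(src_q[src_node], tgt_q[t]))


# Toy graphs (assumed data). Nodes 0 and 1 are the two seeds in each language.
# Source: word 2 co-occurs with seed 0; word 3 co-occurs only with word 2,
# so its relation to seed 0 is indirect.
src_adj = np.array([
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])
# Target: word 2 co-occurs with seed 1; word 3 co-occurs with seed 0.
tgt_adj = np.array([
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
])
seeds = {0: 0, 1: 1}  # node id -> shared seed label id

src_q = propagate(src_adj, seeds, n_seeds=2)
tgt_q = propagate(tgt_adj, seeds, n_seeds=2)
match = best_translation(src_q, tgt_q, src_node=3, tgt_candidates=[2, 3])
```

In this toy setting, source word 3 has no direct edge to any seed, yet propagation gives it mass on seed label 0, so it is paired with the target word that is linked to the same seed. With a purely direct-relation representation its vector would be all zeros, which mirrors the low-coverage problem the abstract describes.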