Paper: Supervised Bilingual Lexicon Induction with Multiple Monolingual Signals

ACL ID N13-1056
Title Supervised Bilingual Lexicon Induction with Multiple Monolingual Signals
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

University of Pennsylvania
Abstract Prior research into learning translations from source and target language monolingual texts has treated the task as an unsupervised learning problem. Although many techniques take advantage of a seed bilingual lexicon, this work is the first to use that data for supervised learning to combine a diverse set of signals derived from a pair of monolingual corpora into a single discriminative model. Even in a low resource machine translation setting, where induced translations have the potential to improve performance substantially, it is reasonable to assume access to some amount of data to perform this kind of optimization. Our work shows that only a few hundred translation pairs are needed to achieve strong performance on the bilingual lexicon...