ACL ID | P99-1067 |
---|---|
Title | Automatic Identification Of Word Translations From Unrelated English And German Corpora |
Venue | Annual Meeting of the Association of Computational Linguistics |
Session | Main Conference |
Year | 1999 |
Authors |
|
Algorithms for aligning words in translated texts are well established. However, only recently have new approaches been proposed to identify word translations from non-parallel or even unrelated texts. This task is more difficult because most statistical clues useful in processing parallel texts cannot be applied to non-parallel texts. Whereas some studies on parallel texts have shown up to 99% of word alignments to be correct, the accuracy for non-parallel texts has so far been around 30%. The current study, which is based on the assumption that there is a correlation between the patterns of word co-occurrences in corpora of different languages, achieves a significant improvement, with about 72% of word translations identified correctly.
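The co-occurrence assumption stated in the abstract can be illustrated with a toy sketch. It is not the paper's actual procedure: the raw counts, the cosine similarity, the window size, the tiny corpora, the seed dictionary, and all function names below are assumptions chosen for illustration. The idea is that a German word's co-occurrence vector, once its dimensions are mapped into English through a small seed dictionary, should resemble the co-occurrence vector of its English translation, even though the two corpora are not translations of each other.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def project(vector, seed_dictionary):
    """Map a source-language vector onto target-language dimensions.

    Context words missing from the (hypothetical) seed dictionary are dropped.
    """
    projected = Counter()
    for context_word, count in vector.items():
        if context_word in seed_dictionary:
            projected[seed_dictionary[context_word]] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_translations(word, source_vectors, target_vectors, seed_dictionary):
    """Rank all target-language words by similarity to the projected source vector."""
    projected = project(source_vectors[word], seed_dictionary)
    scores = {t: cosine(projected, v) for t, v in target_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, unrelated corpora (assumed data, not from the paper).
german = [
    ["lehrer", "schule", "schueler"],
    ["schule", "lehrer"],
    ["hund", "knochen", "garten"],
    ["knochen", "hund"],
]
english = [
    ["teacher", "school", "pupil"],
    ["teacher", "school"],
    ["dog", "bone", "garden"],
    ["dog", "bone"],
]
# Small seed dictionary of already-known translations (assumed).
seed = {"schule": "school", "schueler": "pupil", "knochen": "bone", "garten": "garden"}

german_vectors = cooccurrence_vectors(german)
english_vectors = cooccurrence_vectors(english)

print(rank_translations("lehrer", german_vectors, english_vectors, seed)[0])  # teacher
```

Neither "lehrer" nor "hund" is in the seed dictionary, yet each is matched to its translation because its neighbours are. On real corpora the counts would need an association measure and far larger vocabularies, so this sketch should be read only as an illustration of the underlying assumption.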