Paper: Extraction Of Lexical Translations From Non-Aligned Corpora

ACL ID C96-2098
Title Extraction Of Lexical Translations From Non-Aligned Corpora
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996

A method for extracting lexical translations from non-aligned corpora is proposed to cope with the unavailability of large aligned corpora. The assumption that "translations of two co-occurring words in a source language also co-occur in the target language" is adopted and represented in a stochastic matrix formulation. The translation matrix provides the co-occurrence information translated from the source language into the target language. This translated co-occurrence information should resemble the original co-occurrence information of the target language when the ambiguity of the translational relation is resolved. An algorithm to obtain the best translation matrix is introduced. Experiments were performed to evaluate the effectiveness of the method for ambiguity resolution and for refinement of the dictionary.
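
The idea in the abstract can be illustrated with a small sketch. The abstract does not state the formula, so everything here is an assumption made for illustration: the translated co-occurrence is taken to be T^T A T (A is the source co-occurrence matrix, T a row-stochastic translation matrix), the resemblance to the target co-occurrence matrix B is measured by the sum of absolute differences, and the "best" matrix is simply picked from three hand-written candidates rather than found by the paper's algorithm. The toy numbers and all names are hypothetical.

```python
# Illustrative sketch only. The form T^T A T, the L1 distance, and the
# toy data are assumptions, not taken from the paper.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def transpose(x):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*x)]

def translate_cooccurrence(a, t):
    """Map source co-occurrence a (n x n) into the target space as T^T A T."""
    return matmul(matmul(transpose(t), a), t)

def distance(x, y):
    """Sum of absolute element-wise differences between two matrices."""
    return sum(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

# Source side: two words that co-occur with each other.
A = [[0.0, 0.5],
     [0.5, 0.0]]

# Target side: three words; words 0 and 2 co-occur, word 1 does not
# co-occur with either of them.
B = [[0.0, 0.0, 0.4],
     [0.0, 0.2, 0.0],
     [0.4, 0.0, 0.0]]

# Source word 0 is ambiguous between target words 0 and 1;
# source word 1 always translates to target word 2.
# Each row of a candidate sums to 1 (row-stochastic).
candidates = {
    "unresolved": [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]],
    "sense_0":    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    "sense_1":    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}

# Score each candidate by how closely its translated co-occurrence
# resembles the observed target co-occurrence (lower is better).
scores = {name: distance(translate_cooccurrence(A, t), B)
          for name, t in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

On this toy data the candidate that resolves source word 0 to target word 0 scores lowest, because only that choice reproduces the co-occurrence of target words 0 and 2. A real system would search over T rather than enumerate candidates.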