Paper: Term-list Translation using Mono-lingual Word Co-occurrence Vectors

ACL ID C98-1106
Title Term-list Translation using Mono-lingual Word Co-occurrence Vectors
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors
  • Genichiro Kikui (NTT Information and Communication Systems Laboratories, Yokosuka, Japan)

A term-list is a list of content words that characterize a consistent text or a concept. This paper presents a new method for translating a term-list by using a corpus in the target language. The method first retrieves alternative translations for each input word from a bilingual dictionary. It then determines the most 'coherent' combination of alternative translations, where the coherence of a set of words is defined as the proximity among multi-dimensional vectors produced from the words on the basis of co-occurrence statistics. The method was applied to term-lists extracted from newspaper articles and achieved 81% translation accuracy for ambiguous words (i.e., words with multiple translations).
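
The two steps described in the abstract (dictionary lookup, then selection of the most coherent combination) can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not state how proximity is computed, so this sketch assumes coherence is the mean pairwise cosine similarity between co-occurrence vectors, and it searches all combinations exhaustively. The function names, the toy dictionary, the placeholder source terms, and the three-dimensional vectors are all hypothetical; in practice the vectors would be derived from a target-language corpus.

```python
from itertools import combinations, product
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(words, vectors):
    """ASSUMPTION: coherence = mean pairwise cosine similarity of the words' vectors."""
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


def translate_term_list(terms, dictionary, vectors):
    """Pick one translation per term so that the chosen set is maximally coherent."""
    # Step 1: retrieve the alternative translations of each input term.
    candidates = [dictionary[t] for t in terms]
    # Step 2: score every combination (exhaustive search; fine for short lists only).
    return max(product(*candidates), key=lambda combo: coherence(combo, vectors))


# Hypothetical toy data. Vector dimensions stand for co-occurrence counts with
# three made-up context words: (money, river, leisure).
dictionary = {
    "src1": ["bank", "shore"],
    "src2": ["interest", "hobby"],
    "src3": ["loan"],
}
vectors = {
    "bank": [4, 2, 0],
    "shore": [0, 5, 1],
    "interest": [5, 0, 1],
    "hobby": [0, 0, 5],
    "loan": [5, 0, 0],
}

best = translate_term_list(["src1", "src2", "src3"], dictionary, vectors)
print(best)  # ('bank', 'interest', 'loan')
```

Exhaustive search grows exponentially with the number of ambiguous terms, so a realistic system would need a cheaper search strategy; the sketch keeps it only to make the coherence criterion explicit.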