Paper: An Efficient Method For Determining Bilingual Word Classes

ACL ID E99-1010
Title An Efficient Method For Determining Bilingual Word Classes
Venue Annual Meeting of the European Chapter of the Association for Computational Linguistics
Session Main Conference
Year 1999
Authors

In statistical natural language processing we always face the problem of sparse data. One way to reduce this problem is to group words into equivalence classes, which is a standard method in statistical language modeling. In this paper we describe a method to determine bilingual word classes suitable for statistical machine translation. We develop an optimization criterion based on a maximum-likelihood approach and describe a clustering algorithm. We show that using the resulting bilingual word classes can improve statistical machine translation.
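
To make the idea of maximum-likelihood word classes concrete, the following is a minimal sketch of the standard monolingual case mentioned above: a class-based bigram model whose word-to-class mapping is improved by a simple exchange procedure. It is an illustration only and is not the bilingual criterion or the algorithm of this paper. The function names, the round-robin initialization, and the brute-force recomputation of the likelihood after every tentative move are all assumptions chosen for brevity, not efficiency.

```python
import math
from collections import Counter


def class_bigram_log_likelihood(tokens, word_to_class):
    """Log-likelihood of tokens[1:] under p(w | v) = p(C(w) | C(v)) * p(w | C(w)),
    with all probabilities set to their maximum-likelihood (relative frequency) estimates."""
    pair_counts = Counter()     # N(C_prev, C_cur)
    history_counts = Counter()  # N(C_prev) as a history class
    class_counts = Counter()    # N(C_cur) as a predicted class
    word_counts = Counter()     # N(w) as a predicted word
    for prev, cur in zip(tokens, tokens[1:]):
        c_prev, c_cur = word_to_class[prev], word_to_class[cur]
        pair_counts[(c_prev, c_cur)] += 1
        history_counts[c_prev] += 1
        class_counts[c_cur] += 1
        word_counts[cur] += 1
    ll = 0.0
    for (c_prev, _), n in pair_counts.items():
        ll += n * math.log(n / history_counts[c_prev])          # class transition term
    for w, n in word_counts.items():
        ll += n * math.log(n / class_counts[word_to_class[w]])  # class membership term
    return ll


def exchange_clustering(tokens, num_classes, max_iters=10):
    """Greedy exchange: move one word at a time to the class that most increases
    the log-likelihood, and stop when a full pass changes nothing."""
    vocab = sorted(set(tokens))
    # Hypothetical initialization: round-robin assignment over the sorted vocabulary.
    word_to_class = {w: i % num_classes for i, w in enumerate(vocab)}
    best = class_bigram_log_likelihood(tokens, word_to_class)
    for _ in range(max_iters):
        changed = False
        for w in vocab:
            original = word_to_class[w]
            best_class = original
            for c in range(num_classes):
                if c == original:
                    continue
                word_to_class[w] = c  # tentative move
                ll = class_bigram_log_likelihood(tokens, word_to_class)
                if ll > best + 1e-12:
                    best, best_class = ll, c
            word_to_class[w] = best_class  # keep the best move (or restore)
            changed = changed or best_class != original
        if not changed:
            break
    return word_to_class, best


tokens = "the cat sat on the mat the dog sat on the rug".split()
classes, log_likelihood = exchange_clustering(tokens, num_classes=3)
print(classes, log_likelihood)
```

Because a move is accepted only when it strictly increases the log-likelihood, the criterion never decreases from one pass to the next. A practical implementation would update the counts incrementally instead of recomputing them for every tentative move.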