Paper: A Comparative Evaluation Of Data-Driven Models In Translation Selection Of Machine Translation

ACL ID C02-1072
Title A Comparative Evaluation Of Data-Driven Models In Translation Selection Of Machine Translation
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2002
Authors

We present a comparative evaluation of two data-driven models used for translation selection in English-Korean machine translation. Latent semantic analysis (LSA) and probabilistic latent semantic analysis (PLSA) are applied to implement the data-driven models. These models can represent complex semantic structures of given contexts, such as text passages. Translation selection relies chiefly on grammatical relationships stored in dictionaries. We use k-nearest neighbor (k-NN) learning to select an appropriate translation for instances not seen in the dictionary. The distance between instances in k-NN is computed from the similarity measured by LSA and PLSA. For experiments, we used TREC data (AP news in 1988) for constructin...
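
The k-NN selection step described in the abstract can be illustrated with a minimal sketch. It assumes each instance has already been mapped to a vector in a latent semantic space (by LSA or PLSA, which is not shown here). Cosine similarity, the similarity-weighted vote, the names `cosine` and `knn_select`, and the toy vectors and translation labels are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_select(query, instances, k=3):
    """Pick a translation for `query` from its k most similar seen instances.

    `instances` is a list of (vector, translation) pairs. Each of the k
    nearest neighbors votes for its translation, weighted by its similarity
    to the query (an assumed voting scheme, not taken from the paper).
    """
    ranked = sorted(instances, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = defaultdict(float)
    for vector, translation in ranked[:k]:
        votes[translation] += cosine(query, vector)
    return max(votes, key=votes.get)


# Hypothetical latent-space vectors for instances already in the dictionary.
seen = [
    ([0.9, 0.1], "translation_A"),
    ([0.8, 0.2], "translation_A"),
    ([0.1, 0.9], "translation_B"),
    ([0.2, 0.8], "translation_B"),
]

# Hypothetical vector for an instance not found in the dictionary.
unseen = [0.7, 0.3]
print(knn_select(unseen, seen, k=3))  # prints "translation_A"
```

In this toy setting the two nearest neighbors both carry `translation_A`, so the weighted vote favors it over the single `translation_B` neighbor. Swapping in a different similarity function would only require replacing `cosine`.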