Paper: Computing Word Similarity And Identifying Cognates With Pair Hidden Markov Models

ACL ID W05-0606
Title Computing Word Similarity And Identifying Cognates With Pair Hidden Markov Models
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2005
Authors

We present a system for computing similarity between pairs of words. Our system is based on Pair Hidden Markov Models, a variation on Hidden Markov Models that has been used successfully for the alignment of biological sequences. The parameters of the model are automatically learned from training data that consists of word pairs known to be similar. Our tests focus on the identification of cognates (words of common origin in related languages). The results show that our system outperforms previously proposed techniques.
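
To make the idea of scoring a word pair with a Pair Hidden Markov Model concrete, the following is a minimal sketch of the forward algorithm for the textbook three-state pair HMM used in biological sequence alignment (a match state M that emits a letter pair, and two gap states X and Y that each emit a letter from one word only). It is not the system described in the paper: the transition values `DELTA`, `EPSILON`, and `TAU`, the toy emission functions, and all function names are hypothetical and hand-picked for illustration, whereas the paper learns its parameters from training data.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical, hand-picked parameters (not learned, not taken from the paper).
DELTA = 0.1    # M -> X or M -> Y: open a gap
EPSILON = 0.3  # X -> X or Y -> Y: extend a gap
TAU = 0.1      # any state -> End


def pair_emission(a, b):
    """Toy P(a, b) for the match state; sums to 1 over all letter pairs."""
    n = len(ALPHABET)
    return 0.9 / n if a == b else 0.1 / (n * (n - 1))


def single_emission(a):
    """Toy uniform P(a) for the gap states."""
    return 1.0 / len(ALPHABET)


def forward_probability(x, y):
    """Sum the probability of every alignment of x and y (forward algorithm)."""
    n, m = len(x), len(y)
    f_m = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_x = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_y = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_m[0][0] = 1.0  # the begin state is treated like M

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                # M emits the pair (x[i-1], y[j-1]).
                f_m[i][j] = pair_emission(x[i - 1], y[j - 1]) * (
                    (1 - 2 * DELTA - TAU) * f_m[i - 1][j - 1]
                    + (1 - EPSILON - TAU) * (f_x[i - 1][j - 1] + f_y[i - 1][j - 1])
                )
            if i > 0:
                # X emits x[i-1] against a gap in y.
                f_x[i][j] = single_emission(x[i - 1]) * (
                    DELTA * f_m[i - 1][j] + EPSILON * f_x[i - 1][j]
                )
            if j > 0:
                # Y emits y[j-1] against a gap in x.
                f_y[i][j] = single_emission(y[j - 1]) * (
                    DELTA * f_m[i][j - 1] + EPSILON * f_y[i][j - 1]
                )

    return TAU * (f_m[n][m] + f_x[n][m] + f_y[n][m])


if __name__ == "__main__":
    for other in ("night", "nacht", "zzzzz"):
        print("night", other, forward_probability("night", other))
```

Under these toy parameters, an identical pair scores higher than a partially matching pair such as "night" and "nacht", which in turn scores higher than an unrelated string of the same length. Raw forward probabilities shrink with word length, so a practical similarity measure would need some form of length normalization; that choice is left open here.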