ACL ID | N01-1014 |
---|---|
Title | Identifying Cognates By Phonetic And Semantic Similarity |
Venue | Annual Conference of the North American Chapter of the Association for Computational Linguistics |
Session | Main Conference |
Year | 2001 |
Authors |
I present a method of identifying cognates in the vocabularies of related languages. I show that a measure of phonetic similarity based on multivalued features performs better than “orthographic” measures, such as the Longest Common Subsequence Ratio (LCSR) or Dice’s coefficient. I introduce a procedure for estimating semantic similarity of glosses that employs keyword selection and WordNet. Tests performed on vocabularies of four Algonquian languages indicate that the method is capable of discovering on average nearly 75% of cognates at 50% precision.
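
For readers unfamiliar with the two “orthographic” baselines named in the abstract, the following is a minimal sketch of how they are commonly defined, not code from the paper. It assumes LCSR is the length of the longest common subsequence divided by the length of the longer word, and that Dice’s coefficient is computed over character bigrams; the function names `lcs_length`, `lcsr`, `bigrams`, and `dice` are illustrative choices.

```python
from collections import Counter


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """Assumed definition: LCS length divided by the length of the longer word."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def bigrams(word: str) -> Counter:
    """Multiset of adjacent character pairs in the word."""
    return Counter(word[i:i + 2] for i in range(len(word) - 1))


def dice(a: str, b: str) -> float:
    """Assumed definition: 2 * shared bigrams / total bigrams in both words."""
    bigrams_a, bigrams_b = bigrams(a), bigrams(b)
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    if total == 0:
        return 0.0
    shared = sum((bigrams_a & bigrams_b).values())  # multiset intersection
    return 2 * shared / total


if __name__ == "__main__":
    # An illustrative English/French word pair, not data from the paper.
    print(lcsr("colour", "couleur"))
    print(dice("colour", "couleur"))
```

Both measures compare spellings only, so they treat every character mismatch as equally costly. That is the limitation a feature-based phonetic similarity measure is meant to address.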