Paper: A More Discerning and Adaptable Multilingual Transliteration Mechanism for Indian Languages

ACL ID I08-1009
Title A More Discerning and Adaptable Multilingual Transliteration Mechanism for Indian Languages
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2008
Authors

Transliteration is the process of transcribing words from a source script to a target script. These words can be content words or proper nouns. They may be of local or foreign origin. In this paper we present a more discerning method which applies different techniques based on the word origin. The techniques used also take into account the properties of the scripts. Our approach does not require training data on the target side, while it uses more sophisticated techniques on the source side. Fuzzy string matching is used to compensate for lack of training on the target side. We have evaluated on two Indian languages and have achieved substantially better results (increase of up to 0.44 in MRR) than the baseline and comparable to the state of the art. Our experiments clearly s...
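
To make the evaluation metric concrete, the following is a minimal sketch of how fuzzy matching against a target-side word list and Mean Reciprocal Rank (MRR) could fit together. It is not the method of the paper: the function names `rank_candidates` and `mean_reciprocal_rank` are hypothetical, the Latin-script words are placeholders, and the similarity measure (`difflib.SequenceMatcher` from the Python standard library) is only a stand-in for whatever fuzzy string matching the authors use.

```python
from difflib import SequenceMatcher


def rank_candidates(generated, target_words):
    """Rank known target-side words by similarity to a generated transliteration.

    Assumption: SequenceMatcher.ratio() is a generic stand-in similarity,
    not the fuzzy matching scheme described in the paper.
    """
    return sorted(
        target_words,
        key=lambda word: SequenceMatcher(None, generated, word).ratio(),
        reverse=True,
    )


def mean_reciprocal_rank(ranked_lists, references):
    """Average of 1/rank of the correct answer; a miss contributes 0."""
    if not references:
        raise ValueError("references must not be empty")
    total = 0.0
    for ranked, reference in zip(ranked_lists, references):
        if reference in ranked:
            total += 1.0 / (ranked.index(reference) + 1)
    return total / len(references)


# Hypothetical example: two queries, correct answer at rank 1 and rank 2.
ranked = [["a", "b"], ["b", "a"]]
mrr = mean_reciprocal_rank(ranked, ["a", "a"])  # (1/1 + 1/2) / 2
```

Under this definition, an MRR gain of 0.44 means that the correct transliteration moves substantially closer to the top of the ranked candidate list on average.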