Paper: Loss-Sensitive Discriminative Training of Machine Transliteration Models

ACL ID N09-3011
Title Loss-Sensitive Discriminative Training of Machine Transliteration Models
Venue HLT-NAACL Companion Volume: Student Research Workshop and Doctoral Consortium
Session
Year 2009
Authors

In machine transliteration we transcribe a name across languages while maintaining its phonetic information. In this paper, we present a novel sequence transduction algo- rithm for the problem of machine transliter- ation. Our model is discriminatively trained by the MIRA algorithm, which improves the traditional Perceptron training in three ways: (1) It allows us to consider k-best translitera- tions instead of the best one. (2) It is trained based on the ranking of these transliterations according to user-specified loss function (Lev- enshtein edit distance). (3) It enables the user to tune a built-in parameter to cope with noisy non-separable data during training. On an Arabic-English name transliteration task, our model achieves a relative error reduction of 2.2% over a perceptron-base...