Paper: Discriminative Substring Decoding for Transliteration

ACL ID D09-1111
Title Discriminative Substring Decoding for Transliteration
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009

We present a discriminative substring de- coder for transliteration. This decoder extends recent approaches for discrimi- native character transduction by allow- ing for a list of known target-language words, an important resource for translit- eration. Our approach improves upon Sherif and Kondrak’s (2007b) state-of-the- art decoder, creating a 28.5% relative im- provement in transliteration accuracy on a Japanese katakana-to-English task. We also conduct a controlled comparison of two feature paradigms for discriminative training: indicators and hybrid generative features. Surprisingly, the generative hy- brid outperforms its purely discriminative counterpart, despite losing access to rich source-context features. Finally, we show that machine transliterations have a posi- tive impact ...