Paper: Automatic Transliteration of Romanized Dialectal Arabic

ACL ID W14-1604
Title Automatic Transliteration of Romanized Dialectal Arabic
Venue International Conference on Computational Natural Language Learning
Year 2014

In this paper, we address the problem of converting Dialectal Arabic (DA) text written in the Latin script (known as Arabizi) into Arabic script following the CODA convention for DA orthography. The presented system uses a finite-state transducer trained at the character level to generate all possible transliterations for each input Arabizi word. We then filter the generated list using a DA morphological analyzer and pick the best choice for each input word using a language model. We achieve an accuracy of 69.4% on an unseen test set, compared to 63.1% for a system that represents a previously proposed approach.
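
The three stages described in the abstract (generate, filter, select) can be illustrated with a minimal Python sketch. Everything in it is a toy assumption made for illustration and is not the paper's implementation: a hand-written character table (`CHAR_MAP`) stands in for the trained finite-state transducer, a small word set (`KNOWN_WORDS`) stands in for the DA morphological analyzer, and a unigram score table (`UNIGRAM_LOGPROB`) stands in for the language model. All names and values are hypothetical.

```python
from itertools import product

# Hypothetical character table: a toy stand-in for the trained
# character-level finite-state transducer. An empty string means the
# Latin character may be dropped (e.g. an unwritten short vowel).
CHAR_MAP = {
    "k": ["ك"],
    "t": ["ت", "ط"],
    "a": ["ا", ""],
    "b": ["ب"],
}

# Hypothetical lexicon: a toy stand-in for the DA morphological analyzer.
KNOWN_WORDS = {"كتاب", "كتب"}

# Hypothetical unigram log-probabilities: a toy stand-in for the language model.
UNIGRAM_LOGPROB = {"كتاب": -2.0, "كتب": -3.5}


def generate_candidates(word):
    """Stage 1: enumerate every Arabic-script string the table allows."""
    options = [CHAR_MAP.get(ch, [ch]) for ch in word]
    return {"".join(combo) for combo in product(*options)}


def filter_candidates(candidates):
    """Stage 2: keep only candidates the (toy) analyzer recognizes."""
    return {c for c in candidates if c in KNOWN_WORDS}


def pick_best(candidates):
    """Stage 3: choose the candidate with the highest (toy) LM score."""
    return max(candidates, key=lambda w: UNIGRAM_LOGPROB.get(w, float("-inf")))


def transliterate(word):
    """Run the three stages; return None if no candidate survives filtering."""
    survivors = filter_candidates(generate_candidates(word))
    return pick_best(survivors) if survivors else None


if __name__ == "__main__":
    print(transliterate("ktab"))
```

In this sketch the input `ktab` yields four candidates, two of which pass the lexicon filter, and the score table then selects one of them. The selection step here scores words in isolation, which is a simplification chosen for brevity.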