Paper: Unsupervised Word Segmentation Improves Dialectal Arabic to English Machine Translation

ACL ID W14-3628
Title Unsupervised Word Segmentation Improves Dialectal Arabic to English Machine Translation
Venue Workshop on Arabic Natural Language Processing
Session
Year 2014
Authors

We demonstrate the feasibility of using unsupervised morphological segmentation for dialects of Arabic, which are poor in linguistic resources. Our experiments using a Qatari Arabic to English machine translation system show that unsupervised segmentation helps to improve the translation quality as compared to using no segmentation or to using ATB segmentation, which was especially designed for Modern Standard Arabic (MSA). We use MSA and other dialects to improve Qatari Arabic to English machine translation, and we show that a uniform segmentation scheme across them yields an improvement of 1.5 BLEU points over using no segmentation.