Paper: Is Arabic Part of Speech Tagging Feasible Without Word Segmentation?

ACL ID N10-1105
Title Is Arabic Part of Speech Tagging Feasible Without Word Segmentation?
Venue Human Language Technologies
Session Main Conference
Year 2010
Authors

In this paper, we compare two novel methods for part of speech tagging of Arabic without the use of gold standard word segmentation but with the full POS tagset of the Penn Arabic Treebank. The first approach uses complex tags without any word segmentation; the second approach is segmentation-based, using a machine learning segmenter. Surprisingly, word-based POS tagging yields the best results, with a word accuracy of 94.74%.
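
To make the evaluation measure concrete, the following is a minimal sketch of how word accuracy could be computed so that both approaches are scored on the same footing. It assumes that a word counts as correct only if its entire complex tag matches the gold tag, and that the output of a segmentation-based tagger is mapped back to one tag per word by joining its segment tags. The function names, the "+" separator, and the tag labels are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def compose_word_tag(segment_tags: Sequence[str], sep: str = "+") -> str:
    """Join per-segment POS tags into one complex word-level tag.

    The "+" separator is an assumption made for this sketch.
    """
    return sep.join(segment_tags)


def word_accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of words whose complete complex tag matches the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of words")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Toy data with hypothetical tag labels (not results from the paper).
gold_tags = ["CONJ+NOUN", "PREP", "DET+NOUN", "VERB"]

# Word-based tagger: predicts one complex tag per unsegmented word.
word_based_output = ["CONJ+NOUN", "PREP", "DET+ADJ", "VERB"]

# Segmentation-based tagger: predicts one tag per segment, grouped by word.
segment_based_output = [["CONJ", "NOUN"], ["PREP"], ["DET", "NOUN"], ["NOUN"]]
segment_based_as_words = [compose_word_tag(tags) for tags in segment_based_output]

word_based_score = word_accuracy(gold_tags, word_based_output)
segment_based_score = word_accuracy(gold_tags, segment_based_as_words)
```

Under this scoring, a single wrong segment tag makes the whole word count as an error, which is why the two approaches can be compared directly at the word level even though only one of them segments its input.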