Paper: 3arif: A Corpus of Modern Standard and Egyptian Arabic Tweets Annotated for Epistemic Modality Using Interactive Crowdsourcing

ACL ID C14-1144
Title 3arif: A Corpus of Modern Standard and Egyptian Arabic Tweets Annotated for Epistemic Modality Using Interactive Crowdsourcing
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

We present 3arif, a large-scale corpus of Modern Standard and Egyptian Arabic tweets annotated for epistemic modality. To create 3arif, we design an interactive crowdsourcing annotation procedure that splits up the annotation process into a series of simplified questions, dispenses with the requirement for expert linguistic knowledge and captures nested modality triggers and their attributes semi-automatically.