Paper: Annotating ESL Errors: Challenges and Rewards

ACL ID W10-1004
Title Annotating ESL Errors: Challenges and Rewards
Venue Innovative Use of NLP for Building Educational Applications
Session
Year 2010
Authors

In this paper, we present a corrected and error-tagged corpus of essays written by non-native speakers of English. The corpus contains 63,000 words and includes data from learners of English from nine first-language backgrounds. The annotation was performed at the sentence level and involved correcting all errors in the sentence. The error classification covers mistakes in preposition and article usage, as well as errors in grammar, word order, and word choice. We analyze the errors in the annotated corpus by error category and first-language background, and report inter-annotator agreement on the task. We also describe a computer program that was developed to facilitate and standardize the annotation procedure for the task. The program allows for the annotation of various types of mistake...