Paper: Data Driven Grammatical Error Detection in Transcripts of Children's Speech

ACL ID D14-1106
Title Data Driven Grammatical Error Detection in Transcripts of Children's Speech
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

We investigate grammatical error detection in spoken language, and present a data-driven method to train a dependency parser to automatically identify and label grammatical errors. This method is agnostic to the label set used, and the only manual annotations needed for training are grammatical error labels. We find that the proposed system is robust to disfluencies, so that a separate stage to elide disfluencies is not required. The proposed system outperforms two baseline systems on two different corpora that use different sets of error tags. It is able to identify utterances with grammatical errors with an F1-score as high as 0.623, as compared to a baseline F1 of 0.350 on the same data.
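
The utterance-level F1-score quoted above can be illustrated with a minimal sketch. It assumes each utterance is reduced to a single boolean (True if it contains at least one grammatical error) and that F1 is the usual harmonic mean of precision and recall. The function names `has_error` and `utterance_f1`, and the `"NONE"` placeholder for "no error", are hypothetical and not taken from the paper; this is not the authors' evaluation code.

```python
from typing import Iterable, List


def has_error(token_labels: Iterable[str], no_error_label: str = "NONE") -> bool:
    """Return True if any token label differs from the no-error marker.

    The marker "NONE" is an assumption made for this sketch; any label set
    can be used as long as one value means "no error".
    """
    return any(label != no_error_label for label in token_labels)


def utterance_f1(gold: List[bool], predicted: List[bool]) -> float:
    """Compute F1 over utterances, treating "has an error" as the positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    true_pos = sum(1 for g, p in zip(gold, predicted) if g and p)
    false_pos = sum(1 for g, p in zip(gold, predicted) if not g and p)
    false_neg = sum(1 for g, p in zip(gold, predicted) if g and not p)

    # Guard against empty denominators when nothing is predicted or nothing is gold.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, per-token labels for each utterance can be collapsed with `has_error` and the resulting gold and predicted boolean lists passed to `utterance_f1`.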