Paper: Applying automatically parsed corpora to the study of language variation

ACL ID C14-1186
Title Applying automatically parsed corpora to the study of language variation
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

In this work, we discuss the benefits of using automatically parsed corpora to study language variation. The study of language variation is an area of linguistics in which quantitative methods have been particularly successful. We argue that the large datasets that can be obtained using automatic annotation can help drive further research in this direction, providing sufficient data for the increasingly complex models used to describe variation. We demonstrate this by replicating and extending a previous quantitative variation study that used manually and semi-automatically annotated data. We show that while the study cannot be replicated completely due to limitations of the existing automatic annotation, we can draw at least the same conclusions as the original study. In addition, we de...