Paper: Linguistic Profiling Of Texts For The Purpose Of Language Verification

ACL ID C04-1139
Title Linguistic Profiling Of Texts For The Purpose Of Language Verification
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004
Authors

To control the quality of internet-based language corpora, we developed a method to verify automatically that texts are of (near-)native quality. On the LOCNESS and ICLE corpora, the method separates native texts from non-native learner texts fairly successfully, with an Equal Error Rate of about 10%. For other domains, however, such as internet texts, separate classifiers have to be trained on suitable seed corpora.
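
The Equal Error Rate quoted in the abstract is the operating point at which the false accept rate (non-native texts accepted as native) equals the false reject rate (native texts rejected). The following is a minimal sketch of how such a figure could be computed from verification scores. It is not the paper's implementation: the function name `equal_error_rate`, the score convention (higher means more native-like), and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    native_scores: Sequence[float],
    nonnative_scores: Sequence[float],
) -> Tuple[float, float]:
    """Return (eer, threshold) for a verification task.

    Assumption: a text is accepted as native when its score >= threshold.
    The EER is approximated at the candidate threshold where the false
    accept rate and false reject rate are closest, as their mean.
    """
    if not native_scores or not nonnative_scores:
        raise ValueError("both score lists must be non-empty")

    # Every observed score is a candidate threshold.
    candidates = sorted(set(native_scores) | set(nonnative_scores))

    best_gap = float("inf")
    best_eer = 1.0
    best_threshold = candidates[0]
    for t in candidates:
        # False reject: a native text scoring below the threshold.
        frr = sum(s < t for s in native_scores) / len(native_scores)
        # False accept: a non-native text scoring at or above the threshold.
        far = sum(s >= t for s in nonnative_scores) / len(nonnative_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
            best_threshold = t
    return best_eer, best_threshold


# Hypothetical scores, invented purely to exercise the function.
toy_native = [0.9, 0.8, 0.7, 0.6, 0.3]
toy_nonnative = [0.1, 0.2, 0.4, 0.5, 0.65]
eer, threshold = equal_error_rate(toy_native, toy_nonnative)
```

On these toy scores, one of five native texts falls below the threshold 0.6 and one of five non-native texts reaches it, so both error rates are 20% and the function returns an EER of 0.2. An EER of about 10%, as reported above, would mean roughly one in ten texts of each class is misclassified at the balanced threshold.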