Paper: Can characters reveal your native language? A language-independent approach to native language identification

ACL ID D14-1142
Title Can characters reveal your native language? A language-independent approach to native language identification
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

A common approach in text mining tasks such as text categorization, authorship identification or plagiarism detection is to rely on features like words, part-of-speech tags, stems, or some other high-level linguistic features. In this work, an approach that uses character n-grams as features is proposed for the task of native language identification. Instead of doing standard feature selection, the proposed approach combines several string kernels using multiple kernel learning. Kernel Ridge Regression and Kernel Discriminant Analysis are independently used in the learning stage. The empirical results obtained in all the experiments conducted in this work indicate that the proposed approach achieves state-of-the-art performance in native language identification, reaching an acc...
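
The pipeline the abstract describes (character n-gram string kernels, combined and fed to a kernel method) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain spectrum kernel over character n-grams, replaces multiple kernel learning with an unweighted sum of cosine-normalized kernels, and uses a one-vs-rest Kernel Ridge Regression. The function names, the n-gram lengths (2, 3), the regularization value, the toy sentences and the labels "XX" and "YY" are all hypothetical.

```python
from collections import Counter
import numpy as np

def char_ngrams(text, n):
    """Count all overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def spectrum_kernel(a, b, n):
    """Spectrum string kernel: dot product of character n-gram count vectors."""
    ca, cb = char_ngrams(a, n), char_ngrams(b, n)
    return float(sum(count * cb[gram] for gram, count in ca.items()))

def combined_kernel(xs, ys, ns=(2, 3)):
    """Unweighted sum of cosine-normalized spectrum kernels, one per n.

    Assumption: a plain sum stands in for learned kernel weights.
    """
    total = np.zeros((len(xs), len(ys)))
    for n in ns:
        k = np.array([[spectrum_kernel(x, y, n) for y in ys] for x in xs])
        dx = np.array([spectrum_kernel(x, x, n) for x in xs])
        dy = np.array([spectrum_kernel(y, y, n) for y in ys])
        # Guard against texts shorter than n, whose self-similarity is 0.
        total += k / np.maximum(np.sqrt(np.outer(dx, dy)), 1e-12)
    return total

def krr_fit(K, labels, classes, lam=1e-3):
    """One-vs-rest Kernel Ridge Regression: solve (K + lam*I) alpha = Y."""
    Y = np.array([[1.0 if l == c else -1.0 for c in classes] for l in labels])
    return np.linalg.solve(K + lam * np.eye(len(K)), Y)

def krr_predict(K_test, alpha, classes):
    """Pick, for each test row, the class with the highest regression score."""
    return [classes[i] for i in np.argmax(K_test @ alpha, axis=1)]

# Invented toy data; "XX" and "YY" are placeholder native-language labels.
train_texts = [
    "i am agree with this opinion",
    "he have many informations",
    "she is agree with the peoples",
    "they has much informations about it",
]
train_labels = ["XX", "YY", "XX", "YY"]
classes = ["XX", "YY"]

K_train = combined_kernel(train_texts, train_texts)
alpha = krr_fit(K_train, train_labels, classes)
train_predictions = krr_predict(K_train, alpha, classes)
```

To classify unseen texts, compute `combined_kernel(test_texts, train_texts)` and pass it to `krr_predict`. Working directly on characters means no tokenizer, tagger or stemmer is needed, which is what makes this kind of approach language independent.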