Paper: Chinese Native Language Identification

ACL ID E14-4019
Title Chinese Native Language Identification
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2014

We present the first application of Native Language Identification (NLI) to non-English data. Motivated by theories of language transfer, NLI is the task of identifying a writer's native language (L1) based on their writings in a second language (the L2). An NLI system was applied to Chinese learner texts using topic-independent syntactic models to assess their accuracy. We find that models using part-of-speech tags, context-free grammar production rules and function words are highly effective, achieving a maximum accuracy of 71%. Interestingly, we also find that when applied to equivalent English data, the model performance is almost identical. This finding suggests a systematic pattern of cross-linguistic transfer may exist, where the degree of transfer is independent of...