Paper: Feature Space Selection and Combination for Native Language Identification

ACL ID W13-1712
Title Feature Space Selection and Combination for Native Language Identification
Venue Innovative Use of NLP for Building Educational Applications
Session
Year 2013
Authors

We describe the submissions made by the National Research Council Canada to the Native Language Identification (NLI) shared task. Our submissions rely on a Support Vector Machine classifier, various feature spaces using a variety of lexical, spelling, and syntactic features, and on a simple model combination strategy relying on a majority vote between classifiers. Somewhat surprisingly, a classifier relying on purely lexical features performed very well and proved difficult to outperform significantly using various combinations of feature spaces. However, the combination of multiple predictors allowed us to exploit their different strengths and provided a significant boost in performance.
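
As an illustration of the majority-vote combination mentioned in the abstract, the following is a minimal sketch in Python. It is not the authors' implementation: the function name `majority_vote`, the tie-breaking rule (favoring the classifier listed first), and the example language labels are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    predictions: list of lists, one inner list of labels per classifier,
    all covering the same documents in the same order.
    Ties go to the earliest classifier in the list (an assumption of this
    sketch, not a rule taken from the paper).
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    n_docs = len(predictions[0])
    if any(len(p) != n_docs for p in predictions):
        raise ValueError("all classifiers must label the same documents")

    combined = []
    for votes in zip(*predictions):  # votes for one document
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical outputs of three classifiers built on different feature spaces.
lexical = ["FRE", "GER", "HIN"]
syntactic = ["FRE", "SPA", "TEL"]
spelling = ["ITA", "SPA", "TEL"]
combined = majority_vote([lexical, syntactic, spelling])
```

Listing the strongest single classifier first (for instance the purely lexical one) means that it decides any tie, which is one plausible way to keep the combination from falling below that classifier's own choice when the others disagree with each other.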