Paper: An Off-the-shelf Language Identification Tool

ACL ID P12-3005
Title An Off-the-shelf Language Identification Tool
Venue Annual Meeting of the Association for Computational Linguistics
Session System Demonstration
Year 2012

We present an off-the-shelf language identification tool. We discuss the design and implementation of the tool, and provide an empirical comparison on 5 long-document datasets and 2 datasets from the microblog domain. We find that the tool maintains consistently high accuracy across all domains, making it ideal for end-users who require language identification without wanting to invest in preparation of in-domain training data.