Paper: Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition

ACL ID P14-5003
Title Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

We present two recently released open-source taggers: NameTag is free software for named entity recognition (NER) which achieves state-of-the-art performance on Czech; MorphoDiTa (Morphological Dictionary and Tagger) performs morphological analysis (with lemmatization), morphological generation, tagging and tokenization with state-of-the-art results for Czech and a throughput around 10-200K words per second. The taggers can be trained for any language for which annotated data exist, but they are specifically designed to be efficient for inflective languages. Both tools are free software under the LGPL license and are distributed along with trained linguistic models which are free for non-commercial use under the CC BY-NC-SA license. The releases include standalone tools, C++ l...