Paper: DKIE: Open Source Information Extraction for Danish

ACL ID E14-2016
Title DKIE: Open Source Information Extraction for Danish
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2014

Danish is a major Scandinavian language spoken daily by around six million people. However, it lacks a unified, open set of NLP tools. This demonstration will introduce DKIE, an extensible open-source toolkit for processing Danish text. We implement an information extraction architecture for Danish within GATE, including integrated third-party tools. This implementation includes the creation of a substantial set of corpus annotations for data-intensive named entity recognition. The final application and dataset are openly available, and the part-of-speech tagger and NER model also operate independently or with the Stanford NLP toolkit.