Paper: Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection

ACL ID P12-2071
Title Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2012
Authors

This paper presents a novel way of improving POS tagging on heterogeneous data. First, two separate models are trained (generalized and domain-specific) from the same data set by controlling lexical items with different document frequencies. During decoding, one of the models is selected dynamically given the cosine similarity between each sentence and the training data. This dynamic model selection approach, coupled with a one-pass, left-to-right POS tagging algorithm, is evaluated on corpora from seven different genres. Even with this simple tagging algorithm, our system shows comparable results against other state-of-the-art systems, and gives higher accuracies when evaluated on a mixture of the data. Furthermore, our system is able to tag about 32K tokens per second. We beli...
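The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words representation, the lowercasing, the threshold value, the rule that a high similarity picks the domain-specific model, and the names `cosine_similarity` and `select_model` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_model(sentence_tokens, training_counts, threshold):
    """Pick a model name for one sentence.

    Assumption: a sentence that resembles the training data is sent to the
    domain-specific model, and anything else to the generalized model.
    The threshold is a hypothetical tuning parameter.
    """
    sentence_counts = Counter(token.lower() for token in sentence_tokens)
    similarity = cosine_similarity(sentence_counts, training_counts)
    return "domain" if similarity >= threshold else "general"


# Toy "training data" term counts, invented purely for this example.
training_counts = Counter("the stock market fell as the bank cut rates".split())

print(select_model("The bank cut rates".split(), training_counts, threshold=0.5))
print(select_model("Photosynthesis converts sunlight".split(), training_counts, threshold=0.5))
```

With these toy counts, the first sentence shares most of its words with the training data and is routed to the domain-specific model, while the second shares none and falls back to the generalized model. In practice the threshold would need to be tuned on held-out data.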