Paper: Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling

ACL ID P09-1056
Title Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2009
Authors Fei Huang, Alexander Yates

Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks. Consequently, they have difficulty estimating parameters for types which appear in the test set, but seldom (or never) appear in the training set. We demonstrate that distributional representations of word types, trained on unannotated text, can be used to improve performance on rare words. We incorporate aspects of these representations into the feature space of our sequence-labeling systems. In an experiment on a standard chunking dataset, our best technique improves a chunker from 0.76 F1 to 0.86 F1 on chunks beginning with rare words. On the same dataset, it improves our part-of-speech tagger from 74% to 80% accuracy on rare words.
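
The sketch below illustrates the general idea described in the abstract, not the paper's exact technique: word types receive dense distributional vectors learned from unannotated text (here via SVD of a small word-context co-occurrence matrix, which is an assumption for illustration), and those values are appended to the feature set a sequence labeler would otherwise build from the word type alone. The toy corpus, window size, and number of dimensions are hypothetical.

```python
# Illustrative sketch (assumed setup, not the paper's method): build
# distributional vectors for word types from unannotated text and expose
# them as extra features so rare words still get informative inputs.
from collections import Counter, defaultdict
import numpy as np

unannotated_text = [
    "the dog chased the cat".split(),
    "a dog barked at a cat".split(),
    "the cat sat on the mat".split(),
]

# 1. Count word-context co-occurrences within a +/-1 token window.
cooc = defaultdict(Counter)
for sent in unannotated_text:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):
            if 0 <= j < len(sent):
                cooc[w][sent[j]] += 1

vocab = sorted(cooc)
contexts = sorted({c for counts in cooc.values() for c in counts})
M = np.zeros((len(vocab), len(contexts)))
for wi, w in enumerate(vocab):
    for ci, c in enumerate(contexts):
        M[wi, ci] = cooc[w][c]

# 2. Truncated SVD gives each word type a low-dimensional vector.
k = 3  # hypothetical number of latent dimensions
U, S, _ = np.linalg.svd(M, full_matrices=False)
word_vectors = {w: U[i, :k] * S[:k] for i, w in enumerate(vocab)}

# 3. A sequence labeler can append these values to its usual word-type
#    features; unseen words fall back to a zero vector here.
def features(word):
    vec = word_vectors.get(word, np.zeros(k))
    feats = {"word=" + word.lower()}
    feats.update(f"dist{d}={v:+.2f}" for d, v in enumerate(vec))
    return feats

print(features("dog"))
```

In this toy setting the distributional dimensions act as real-valued stand-ins for the word identity, which is the property that lets a tagger or chunker generalize to word types it saw rarely (or never) in annotated training data.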