Paper: Neural Networks Leverage Corpus-wide Information for Part-of-speech Tagging

ACL ID D14-1101
Title Neural Networks Leverage Corpus-wide Information for Part-of-speech Tagging
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

We propose a neural network approach to benefit from the non-linearity of corpus-wide statistics for part-of-speech (POS) tagging. We investigate several types of corpus-wide information for words, such as word embeddings and POS tag distributions. Since these statistics are encoded as dense continuous features, combining these features is not as straightforward as it is for sparse discrete features. Our tagger is designed as a combination of a linear model for discrete features and a feed-forward neural network that captures the non-linear interactions among the continuous features. By using several recent advances in activation functions for neural networks, the proposed method achieves new state-of-the-art accuracies on English POS tagging tasks.
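
To make the hybrid design concrete, the following is a minimal sketch of how a per-tag score could be formed as the sum of a linear model over sparse discrete features and a feed-forward network over dense continuous features. It is illustrative only and not the authors' implementation: the function name `hybrid_scores`, the single hidden layer, the toy weights, and the ReLU activation are all assumptions, since the abstract does not name the activation functions or layer sizes used.

```python
def relu(values):
    # ReLU is a stand-in activation (an assumption, not taken from the paper).
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    # Plain matrix-vector product over nested lists.
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def hybrid_scores(active_feats, dense_x, W_lin, W1, b1, W2, b2):
    """Return one score per tag: linear(sparse) + feed-forward(dense)."""
    n_tags = len(b2)
    # Linear part: add up the weights of the active discrete features.
    linear = [sum(W_lin[f][t] for f in active_feats) for t in range(n_tags)]
    # Non-linear part: one hidden layer over the continuous features.
    hidden = relu([h + b for h, b in zip(matvec(W1, dense_x), b1)])
    nonlinear = [o + b for o, b in zip(matvec(W2, hidden), b2)]
    return [l + n for l, n in zip(linear, nonlinear)]


# Toy setup (hypothetical): 3 discrete features, 2 dense inputs, 2 tags.
W_lin = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
W1 = [[1.0, -1.0], [-1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [0.0, 0.0]

# Discrete features 0 and 2 fire; dense_x stands in for embedding values.
scores = hybrid_scores([0, 2], [2.0, 0.5], W_lin, W1, b1, W2, b2)
predicted_tag = max(range(len(scores)), key=lambda t: scores[t])
```

In this sketch the two components are simply added before taking the highest-scoring tag, so the discrete features keep their sparse linear treatment while interactions among the continuous inputs pass through the hidden layer.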