Paper: Word Representations: A Simple and General Method for Semi-Supervised Learning

ACL ID P10-1040
Title Word Representations: A Simple and General Method for Semi-Supervised Learning
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2010
Authors

If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features, for off-the-shelf use in existing NLP systems, as well as our code, here: http://metaoptimize.com/projects/wordreprs/
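The idea of the abstract can be sketched as follows: each token's usual hand-crafted features are augmented with features derived from unsupervised word representations. This is a minimal illustrative sketch, not the paper's implementation; the embedding vectors, Brown cluster bit-strings, and feature names below are toy values invented for the example (the paper's actual features are downloadable at the URL above).

```python
# Toy stand-ins for pre-trained resources (hypothetical values, not the
# paper's real data):
embeddings = {            # e.g. Collobert & Weston or HLBL embeddings
    "paris": [0.12, -0.48, 0.33],
    "france": [0.10, -0.51, 0.29],
}
brown_paths = {           # Brown cluster bit-string paths
    "paris": "0111010",
    "france": "0111011",
}

def token_features(word):
    """Baseline features plus word-representation features."""
    feats = {                        # typical supervised baseline features
        "lower": word.lower(),
        "is_cap": word[0].isupper(),
        "suffix3": word[-3:],
    }
    w = word.lower()
    # Brown clusters: use path prefixes at several lengths, giving the
    # tagger coarse-to-fine cluster identities.
    path = brown_paths.get(w)
    if path is not None:
        for p in (4, 6, 10):
            feats[f"brown_prefix_{p}"] = path[:p]
    # Embeddings: add each dimension as a real-valued feature.
    emb = embeddings.get(w)
    if emb is not None:
        for i, v in enumerate(emb):
            feats[f"emb_{i}"] = v
    return feats

feats = token_features("Paris")
```

The feature dict can then be fed to any feature-based tagger (e.g. a CRF or perceptron for NER or chunking); combining several representations simply means merging their feature groups into the same dict.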