Paper: Phrase Clustering for Discriminative Learning

ACL ID P09-1116
Title Phrase Clustering for Discriminative Learning
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2009

We present a simple and scalable algorithm for clustering tens of millions of phrases and use the resulting clusters as features in discriminative classifiers. To demonstrate the power and generality of this approach, we apply the method in two very different applications: named entity recognition and query classification. Our results show that phrase clusters offer significant improvements over word clusters. Our NER system achieves the best current result on the widely used CoNLL benchmark. Our query classifier is on par with the best system in KDDCUP 2005 without resorting to labor-intensive knowledge engineering efforts.
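
As an illustration of what "clusters as features" could look like in practice, the following minimal Python sketch attaches phrase-cluster features to tokens. It is not the paper's implementation: the table PHRASE_CLUSTERS, its cluster IDs, the function name cluster_features, the greedy longest-match lookup, and the B/I feature naming are all assumptions made for this example. A real table would come from clustering a very large phrase inventory.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase-to-cluster table. The phrases and cluster IDs are
# invented for illustration; they are not taken from the paper.
PHRASE_CLUSTERS: Dict[Tuple[str, ...], int] = {
    ("new", "york"): 7,
    ("los", "angeles"): 7,
    ("bank", "of", "america"): 3,
    ("york",): 12,
}

# Longest phrase in the table, used to bound the lookup window.
MAX_PHRASE_LEN = max(len(p) for p in PHRASE_CLUSTERS)


def cluster_features(tokens: List[str]) -> List[List[str]]:
    """Return one feature list per token.

    Each token always gets a lowercase word feature. Tokens covered by a
    phrase found in PHRASE_CLUSTERS also get a phrase-cluster feature.
    Phrases are matched greedily, longest first (an assumption of this sketch).
    """
    lowered = [t.lower() for t in tokens]
    feats: List[List[str]] = [[f"word={w}"] for w in lowered]
    i = 0
    while i < len(lowered):
        matched = 0
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(lowered) - i), 0, -1):
            cid = PHRASE_CLUSTERS.get(tuple(lowered[i:i + n]))
            if cid is not None:
                for j in range(i, i + n):
                    # B marks the first token of the phrase, I the rest.
                    pos = "B" if j == i else "I"
                    feats[j].append(f"phrase_cluster={pos}-{cid}")
                matched = n
                break
        # Skip past the matched phrase, or advance one token if none matched.
        i += matched if matched else 1
    return feats


if __name__ == "__main__":
    for token, f in zip(["Flights", "to", "New", "York"],
                        cluster_features(["Flights", "to", "New", "York"])):
        print(token, f)
```

The resulting per-token feature lists could then be fed to any discriminative sequence or text classifier. Because "New York" and "Los Angeles" share a cluster ID in this toy table, a model trained on one could generalize to the other, which is the intuition behind using cluster membership as a feature.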