Paper: Structured Generative Models for Unsupervised Named-Entity Clustering

ACL ID N09-1019
Title Structured Generative Models for Unsupervised Named-Entity Clustering
Venue Human Language Technologies
Session Main Conference
Year 2009
Authors

We describe a generative model for clustering named entities which also models named entity internal structure, clustering related words by role. The model is entirely unsupervised; it uses features from the named entity itself and its syntactic context, and coreference information from an unsupervised pronoun resolver. The model scores 86% on the MUC-7 named-entity dataset. To our knowledge, this is the best reported score for a fully unsupervised model, and the best score for a generative model.