Paper: Unsupervised Personal Name Disambiguation

ACL ID W03-0405
Title Unsupervised Personal Name Disambiguation
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2003

This paper presents a set of algorithms for distinguishing personal names with multiple real referents in text, based on little or no supervision. The approach utilizes an unsupervised clustering technique over a rich feature space of biographic facts, which are automatically extracted via a language-independent bootstrapping process. The induced clustering of named entities is then partitioned and linked to their real referents via the automatically extracted biographic data. Performance is evaluated both on a test set of hand-labeled multi-referent personal names and via automatically generated pseudonames.
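
The pseudoname evaluation and the clustering step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that a pseudoname is built by conflating the contexts of two unambiguous names under one artificial token, it substitutes plain bag-of-words vectors for the bootstrapped biographic features, and it uses average-link agglomerative clustering with cosine similarity as one plausible clustering choice. All function names, the `PSEUDONAME` token, and the example sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def make_pseudoname(contexts_a, name_a, contexts_b, name_b, pseudo="PSEUDONAME"):
    """Conflate two unambiguous names into one artificial ambiguous name.

    Returns (masked_contexts, gold_labels); each gold label is the true referent.
    """
    masked, gold = [], []
    for name, contexts in ((name_a, contexts_a), (name_b, contexts_b)):
        for text in contexts:
            masked.append(text.replace(name, pseudo))
            gold.append(name)
    return masked, gold


def vectorize(text, pseudo="PSEUDONAME"):
    """Bag-of-words counts (assumption: a stand-in for biographic features)."""
    return Counter(w for w in text.lower().split() if w != pseudo.lower())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def agglomerative(vectors, k):
    """Bottom-up average-link clustering; stops when k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        def avg_sim(pair):
            a, b = pair
            sims = [cosine(vectors[i], vectors[j])
                    for i in clusters[a] for j in clusters[b]]
            return sum(sims) / len(sims)

        # Merge the most similar pair of clusters (a < b, so deleting b is safe).
        a, b = max(combinations(range(len(clusters)), 2), key=avg_sim)
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


def purity(clusters, gold):
    """Fraction of contexts that carry the majority gold label of their cluster."""
    correct = sum(max(Counter(gold[i] for i in c).values()) for c in clusters)
    return correct / len(gold)


# Invented example contexts for two hypothetical people.
contexts_a = ["Alice Adams was born in 1950 in Ohio",
              "the painter Alice Adams born in Ohio exhibited paintings"]
contexts_b = ["Bob Brown is a senator from Texas",
              "senator Bob Brown of Texas voted on the bill"]

masked, gold = make_pseudoname(contexts_a, "Alice Adams", contexts_b, "Bob Brown")
clusters = agglomerative([vectorize(t) for t in masked], k=2)
score = purity(clusters, gold)
```

Because the gold referent of every conflated context is known by construction, a score such as `purity` can be computed without hand labeling, which is the appeal of pseudonames as an evaluation device. Fixing `k=2` is a simplification for the sketch; with real ambiguous names the number of referents is generally not known in advance.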