Paper: Clustering Technique in Multi-Document Personal Name Disambiguation

ACL ID P09-3011
Title Clustering Technique in Multi-Document Personal Name Disambiguation
Venue ACL-IJCNLP: Student Research Workshop papers
Session
Year 2009
Authors

This paper addresses multi-document personal name disambiguation with an agglomerative clustering approach. We start from an analysis of pointwise mutual information between a feature and the ambiguous name, which leads to a novel method for computing feature weights in clustering. We then propose a trade-off measure between within-cluster compactness and among-cluster separation as the stopping criterion for clustering. After that, we apply a labeling method to find representative features for each cluster. Finally, experiments on word-based clustering over a Chinese dataset show that the approach performs well.
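
As an illustration of the quantity the feature weighting starts from, the following is a minimal sketch of standard pointwise mutual information between a feature and an ambiguous name, estimated from document counts. The abstract does not give the paper's actual weight formula, so this shows only textbook PMI; the function name `pmi`, its parameters, and the use of document-level co-occurrence counts are assumptions made for the example.

```python
import math


def pmi(joint_count, feature_count, name_count, total_docs):
    """Textbook PMI, log2(P(f, n) / (P(f) * P(n))), from document counts.

    Hypothetical helper: the count-based probability estimates are an
    assumption, not the paper's stated weighting scheme.
    """
    if min(joint_count, feature_count, name_count, total_docs) <= 0:
        raise ValueError("all counts must be positive")
    p_joint = joint_count / total_docs      # docs containing both feature and name
    p_feature = feature_count / total_docs  # docs containing the feature
    p_name = name_count / total_docs        # docs containing the name
    return math.log2(p_joint / (p_feature * p_name))


# A feature that co-occurs with the name exactly as often as chance predicts
# scores about 0; one that appears only alongside the name scores higher.
independent = pmi(joint_count=10, feature_count=20, name_count=50, total_docs=100)
associated = pmi(joint_count=20, feature_count=20, name_count=50, total_docs=100)
```

A positive score means the feature appears with the name more often than independence would predict, which is why PMI is a natural starting point for weighting features in the clustering step.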