Paper: A Data Driven Approach for Person Name Disambiguation in Web Search Results

ACL ID C14-1030
Title A Data Driven Approach for Person Name Disambiguation in Web Search Results
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

This paper presents an unsupervised approach to clustering the results of a search engine when the query is a person name shared by different individuals. We propose an algorithm that calculates the number of clusters and groups the web pages according to the different individuals without the need for any training data or predefined thresholds, which the successful state-of-the-art systems require. In addition, most of those systems do not deal with social media web pages, so their performance could fail in a real scenario. In this paper we also propose a heuristic method for the treatment of social networking profiles. Our approach is evaluated on four gold standard collections for this task, obtaining competitive results, comparable to those obtained by some a...