ACL ID | D07-1074 |
---|---|
Title | Large-Scale Named Entity Disambiguation Based on Wikipedia Data |
Venue | Conference on Empirical Methods in Natural Language Processing |
Session | Main Conference |
Year | 2007 |
Authors |
|
This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.
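
The agreement-maximization idea in the abstract can be illustrated with a minimal sketch. It assumes bag-of-words contexts and a simple additive score, neither of which is confirmed by the abstract; the function name `disambiguate`, the data layout, and the toy entities are hypothetical and are not taken from the paper.

```python
from collections import Counter


def disambiguate(doc_words, candidates):
    """Pick one entity per mention by maximizing two kinds of agreement.

    doc_words:  list of lowercased tokens from the document.
    candidates: dict mapping mention -> {entity_name: (context_words, categories)},
                where both context_words and categories are sets of strings.
    This layout is an assumption made for illustration only.
    """
    doc_counts = Counter(doc_words)
    chosen = {}
    for mention, entities in candidates.items():
        # Pool the category tags of candidates for all *other* mentions.
        other_categories = Counter()
        for other_mention, other_entities in candidates.items():
            if other_mention == mention:
                continue
            for _context, categories in other_entities.values():
                other_categories.update(categories)

        def score(item):
            _name, (context, categories) = item
            # Agreement between the entity's context and the document.
            context_agreement = sum(doc_counts[w] for w in context)
            # Agreement between the entity's tags and the other candidates' tags.
            category_agreement = sum(other_categories[c] for c in categories)
            return context_agreement + category_agreement

        chosen[mention] = max(entities.items(), key=score)[0]
    return chosen


# Toy data, invented for this sketch.
doc_words = "columbia shuttle launched nasa mission discovery".split()
candidates = {
    "Columbia": {
        "Space Shuttle Columbia": ({"nasa", "shuttle", "mission"}, {"space shuttles"}),
        "Columbia University": ({"university", "new", "york"}, {"universities"}),
    },
    "Discovery": {
        "Space Shuttle Discovery": ({"nasa", "shuttle"}, {"space shuttles"}),
        "Discovery Channel": ({"television", "channel"}, {"tv channels"}),
    },
}
result = disambiguate(doc_words, candidates)
```

In this toy case both mentions resolve to the shuttle senses, because their contexts overlap with the document and their category tags agree with each other. A real system would need weighting, normalization, and far larger candidate sets than this sketch shows.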