Paper: The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet

ACL ID W11-0122
Title The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet
Venue IWCS
Session Main Conference
Year 2011
Authors

We propose a method to automatically align WordNet synsets and Wikipedia articles to obtain a sense inventory of higher coverage and quality. For each WordNet synset, we first extract a set of Wikipedia articles as alignment candidates; in a second step, we determine which article (if any) is a valid alignment, i.e. is about the same sense or concept. In this paper, we go significantly beyond state-of-the-art word overlap approaches, and apply a threshold-based Personalized PageRank method for the disambiguation step. We show that WordNet synsets can be aligned to Wikipedia articles with a performance of up to 0.78 F1-Measure based on a comprehensive, well-balanced reference dataset consisting of 1,815 manually annotated sense alignment candidates. The fully-aligned resource as well as the reference dataset is public...
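
The two-step procedure described in the abstract (candidate extraction, then threshold-based disambiguation with Personalized PageRank) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the toy graph, the term lists, the use of cosine similarity to compare PageRank vectors, the damping factor, and the threshold value are all assumptions introduced here, and the names `personalized_pagerank`, `cosine`, and `align` are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Optional

Graph = Dict[str, List[str]]  # adjacency list; every neighbour must also be a key


def personalized_pagerank(graph: Graph, seeds: Iterable[str],
                          damping: float = 0.85,
                          iterations: int = 50) -> Dict[str, float]:
    """Power iteration where teleportation jumps only to the seed nodes."""
    seed_set = set(seeds) & set(graph)
    if not seed_set:
        return {node: 0.0 for node in graph}
    teleport = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in graph}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in graph}
        for node, neighbours in graph.items():
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seed distribution.
                for m in graph:
                    nxt[m] += damping * rank[node] * teleport[m]
        rank = nxt
    return rank


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors (an assumed comparison measure)."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def align(graph: Graph, synset_terms: List[str],
          candidates: Dict[str, List[str]], threshold: float) -> Optional[str]:
    """Return the best-scoring candidate article, or None if it misses the threshold."""
    synset_vec = personalized_pagerank(graph, synset_terms)
    best_title, best_score = None, 0.0
    for title, terms in candidates.items():
        score = cosine(synset_vec, personalized_pagerank(graph, terms))
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= threshold else None


# Toy data (entirely made up for illustration).
GRAPH: Graph = {
    "bank": ["money", "river"],
    "money": ["bank", "loan"],
    "loan": ["money"],
    "river": ["bank", "water"],
    "water": ["river"],
}
SYNSET_TERMS = ["money", "loan"]          # terms describing one synset
CANDIDATES = {                            # step 1: candidate articles for that synset
    "Bank (finance)": ["loan", "money"],
    "Bank (geography)": ["river", "water"],
}

# Step 2: keep the best candidate only if its similarity reaches the threshold.
print(align(GRAPH, SYNSET_TERMS, CANDIDATES, threshold=0.95))
```

Returning `None` when no candidate reaches the threshold corresponds to the "(if any)" in the abstract: a synset may legitimately have no matching article, so the threshold guards against forcing an alignment.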