Paper: Joint Bootstrapping of Corpus Annotations and Entity Types

ACL ID D13-1042
Title Joint Bootstrapping of Corpus Annotations and Entity Types
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013

Web search can be enhanced in powerful ways if token spans in Web text are annotated with disambiguated entities from large catalogs like Freebase. Entity annotators need to be trained on sample mention snippets. Wikipedia entities and annotated pages offer high-quality labeled data for training and evaluation. Unfortunately, Wikipedia features only one-ninth as many entities as Freebase, and these are a highly biased sample of well-connected, frequently mentioned "head" entities. To bring hope to "tail" entities, we broaden our goal to a second task: assigning types to entities that are in Freebase but not in Wikipedia. The two tasks are synergistic: knowing the types of unfamiliar entities helps disambiguate mentions, and words in mention contexts help assign types to entities. We pres...