Paper: Toward Completeness in Concept Extraction and Classification

ACL ID D09-1099
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009

Many algorithms extract terms from text together with some kind of taxonomic classification (is-a) link. However, the general approaches used today, and specifically the methods of evaluating results, exhibit serious shortcomings. Harvesting without focusing on a specific conceptual area may deliver large numbers of terms, but they are scattered over an immense concept space, making Recall judgments impossible. Regarding Precision, simply judging the correctness of terms and their individual classification links may provide high scores, but this doesn’t help with the eventual assembly of terms into a single coherent taxonomy. Furthermore, since there is no correct and complete gold standard to measure against, most work invents some ad hoc evaluation measure. We present an algo...