Paper: Automatic Evaluation of Topic Coherence

ACL ID N10-1012
Title Automatic Evaluation of Topic Coherence
Venue Human Language Technologies
Session Main Conference
Year 2010

This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and existing research on lexical similarity/relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show a simple co-occurrence measure based on pointwise mutual information over Wikipedia data is able to achieve results for the task at or nearing the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results. Google produces strong, if less consistent, results, while our results over Wor...
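
As a rough illustration of the co-occurrence measure based on pointwise mutual information mentioned in the abstract, the Python sketch below scores a topic by the mean PMI over all pairs of its words. It is not the paper's implementation. The function name `pmi_coherence`, the use of document-level co-occurrence counts, the `eps` smoothing constant, and the toy corpus are all assumptions made for this example; the abstract does not specify how co-occurrence is counted over Wikipedia.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, documents, eps=1e-12):
    """Mean pairwise PMI of topic_words.

    Probabilities are estimated as the fraction of documents that
    contain a word (or both words of a pair). Document-level counting
    and eps smoothing are assumptions of this sketch.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        # eps keeps the logarithm defined when a pair never co-occurs.
        scores.append(math.log((p12 + eps) / (p1 * p2 + eps)))
    return sum(scores) / len(scores)


# Toy reference corpus (hypothetical data, standing in for Wikipedia).
docs = [
    ["stock", "market", "trade"],
    ["stock", "market", "bank"],
    ["goal", "team", "match"],
    ["team", "match", "league"],
]

coherent = pmi_coherence(["stock", "market"], docs)
mixed = pmi_coherence(["stock", "team"], docs)
print(coherent > mixed)
```

In this toy setting, words that always appear together receive a positive score, while words that never co-occur receive a strongly negative one, so a higher mean PMI is read as a more coherent topic.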