Paper: Automatic Keyphrase Extraction via Topic Decomposition

ACL ID D10-1036
Title Automatic Keyphrase Extraction via Topic Decomposition
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2010

Existing graph-based ranking methods for keyphrase extraction compute a single impor- tance score for each word via a single ran- dom walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into mul- tiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we fur- ther calculate the ranking scores of words and extract the top ranked ones as keyphrases. Ex- perimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation met- rics.