Paper: A Novel Content Enriching Model for Microblog Using News Corpus

ACL ID P14-2036
Title A Novel Content Enriching Model for Microblog Using News Corpus
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

In this paper, we propose a novel model for enriching the content of microblogs by exploiting external knowledge, thus alleviating the data sparseness problem in short text classification. We assume that microblogs share the same topics as the external knowledge. We first build an optimization model to infer the topics of microblogs by employing the topic-word distribution of the external knowledge. Then the content of microblogs is further enriched with relevant words from the external knowledge. Experiments on microblog classification show that our approach is effective and outperforms traditional text classification methods.
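
The two steps described in the abstract (infer a microblog's topics from a fixed topic-word distribution, then add relevant words) can be illustrated with a minimal sketch. The abstract does not state the optimization objective, so this sketch assumes a stand-in: maximum-likelihood estimation of the topic mixture by EM with the topic-word matrix held fixed. The names `infer_topics` and `enrich`, the toy vocabulary, and the matrix `phi` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

# Toy topic-word distribution, assumed to have been learned from a news corpus.
# Rows are topics (sports, finance); columns follow `vocab`. Values are made up.
vocab = ["goal", "match", "team", "stock", "market", "bank"]
phi = np.array([
    [0.40, 0.30, 0.27, 0.01, 0.01, 0.01],  # hypothetical "sports" topic
    [0.01, 0.01, 0.01, 0.35, 0.32, 0.30],  # hypothetical "finance" topic
])


def infer_topics(counts, phi, n_iter=50):
    """Estimate a topic mixture for one short text, keeping phi fixed.

    Assumption: EM for mixture weights stands in for the paper's
    optimization model, which the abstract does not specify.
    """
    k = phi.shape[0]
    theta = np.full(k, 1.0 / k)
    if counts.sum() == 0:
        return theta  # no known words: fall back to a uniform mixture
    for _ in range(n_iter):
        # E-step: responsibility of each topic for each word, shape (K, V).
        joint = theta[:, None] * phi
        resp = joint / np.maximum(joint.sum(axis=0, keepdims=True), 1e-12)
        # M-step: renormalize the expected topic counts.
        theta = resp @ counts
        theta /= theta.sum()
    return theta


def enrich(tokens, vocab, phi, top_n=1):
    """Append the top_n most probable words not already in the microblog."""
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros(len(vocab))
    for t in tokens:
        if t in index:
            counts[index[t]] += 1
    theta = infer_topics(counts, phi)
    scores = theta @ phi  # word probabilities under the inferred mixture
    ranked = [vocab[i] for i in np.argsort(-scores) if vocab[i] not in tokens]
    return theta, list(tokens) + ranked[:top_n]


theta, enriched = enrich(["goal", "match"], vocab, phi)
print(theta, enriched)
```

On this toy input, the inferred mixture concentrates on the hypothetical sports topic, so the word appended to the microblog is "team". A real system would learn `phi` from the news corpus with a topic model and would likely weight or filter the added words before classification.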