Paper: Sprinkling Topics for Weakly Supervised Text Classification

ACL ID P14-2010
Title Sprinkling Topics for Weakly Supervised Text Classification
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

Supervised text classification algorithms require a large number of documents labeled by humans, which is a labor-intensive and time-consuming process. In this paper, we propose a weakly supervised algorithm in which supervision comes in the form of labeling of Latent Dirichlet Allocation (LDA) topics. We then use this weak supervision to "sprinkle" artificial words into the training documents to identify topics in accordance with the underlying class structure of the corpus, based on higher-order word associations. We evaluate how well this approach improves text classification performance on three real-world datasets.
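The sprinkling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that an LDA run has already produced one topic id per token, that an annotator has mapped some topic ids to class labels, and that a document receives an artificial word for a class when that class's labeled topics cover at least a chosen share of its tokens. The function name `sprinkle`, the `min_share` threshold, and the `__class_<label>__` word format are all hypothetical choices made for illustration.

```python
from collections import Counter


def sprinkle(doc_tokens, token_topics, topic_labels, min_share=0.3):
    """Append artificial class words to a document (illustrative only).

    doc_tokens   -- list of word tokens in the document
    token_topics -- list of topic ids, one per token (assumed to come from LDA)
    topic_labels -- dict mapping a topic id to an annotator-assigned class label;
                    unlabeled topics are simply absent from the dict
    min_share    -- hypothetical threshold: minimum fraction of tokens a class
                    must cover before its artificial word is added
    """
    if len(doc_tokens) != len(token_topics):
        raise ValueError("one topic id is required per token")

    # Count how many tokens fall under each class via the topic -> class mapping.
    class_counts = Counter(
        topic_labels[t] for t in token_topics if t in topic_labels
    )

    total = len(doc_tokens)
    # One artificial word per sufficiently represented class, in sorted order.
    artificial = [
        f"__class_{label}__"
        for label, count in sorted(class_counts.items())
        if total and count / total >= min_share
    ]
    return list(doc_tokens) + artificial


# Toy example: topic 2 was left unlabeled by the annotator.
topic_labels = {0: "sports", 1: "politics"}
doc = ["match", "goal", "vote", "team", "the"]
topics = [0, 0, 1, 0, 2]
sprinkled = sprinkle(doc, topics, topic_labels)
print(sprinkled)
```

In this toy document, three of the five tokens belong to the topic labeled "sports", so only the sports word is appended. In a pipeline of the kind the abstract outlines, the sprinkled documents would then be used to learn topics again, so that the artificial words pull the topics toward the class structure.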