Paper: Soft-Supervised Learning for Text Classification

ACL ID D08-1114
Title Soft-Supervised Learning for Text Classification
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008

We propose a new graph-based semi-supervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a weighted undirected graph and our proposed framework minimizes the weighted Kullback-Leibler divergence between distributions that encode the class membership probabilities of each vertex. The proposed objective is convex with guaranteed convergence using an alternating minimization procedure. Further, it generalizes in a straightforward manner to multi-class problems. We present results on two standard tasks, namely Reuters-21578 and WebKB, showing that the proposed algorithm significantly outperforms the state-of-the-art.
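
The abstract does not spell out the objective, so the following is only an illustrative sketch of what a weighted KL-divergence objective over a document graph might look like, not the paper's exact formulation. The function names, the trade-off parameter `mu`, the smoothing constant `eps`, and the split into a label-fit term plus an edge-weighted smoothness term are all assumptions made for illustration. The alternating minimization procedure itself is not shown.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps is a small smoothing constant (an assumption) to avoid log(0).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def graph_kl_objective(weights, beliefs, labels, mu=1.0):
    """Hypothetical objective: label-fit term plus edge-weighted KL smoothness.

    weights: dict mapping (i, j) vertex pairs to non-negative edge weights
    beliefs: dict mapping each vertex id to its class-probability list
    labels:  dict mapping labeled vertex ids to reference distributions
    mu:      assumed trade-off between fitting labels and graph smoothness
    """
    fit = sum(kl_divergence(r, beliefs[i]) for i, r in labels.items())
    smooth = sum(w * kl_divergence(beliefs[i], beliefs[j])
                 for (i, j), w in weights.items())
    return fit + mu * smooth

# Toy graph: three documents, two classes, documents 0 and 2 labeled.
weights = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 0.5, (2, 1): 0.5}
labels = {0: [1.0, 0.0], 2: [0.0, 1.0]}
uniform = {v: [0.5, 0.5] for v in range(3)}
objective_at_uniform = graph_kl_objective(weights, uniform, labels)
```

In this toy setup, uniform beliefs incur no smoothness penalty (all neighbors agree) but pay a label-fit cost at the two labeled vertices; a minimizer would trade the two terms off against each other.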