Paper: Improved Bayesian Logistic Supervised Topic Models with Data Augmentation

ACL ID P13-1019
Title Improved Bayesian Logistic Supervised Topic Models with Data Augmentation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formulation of Bayesian inference; and 2) developing a simple Gibbs sampling algorithm by introducing auxiliary Polya-Gamma variables and collapsing out Dirichlet variables. Our augment-and-collapse sampling algorithm has analytical forms of each conditional distribution without making any restricting assumptions and can be easily parallelized. Empirical results demonstrate significant improveme...
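
To illustrate the data augmentation idea named in the abstract, the following is a minimal sketch of the standard Polya-Gamma identity for a logistic likelihood, not the authors' sampler. It assumes the regularization constant c enters as an exponent on the logistic likelihood, giving exp(c*y*psi) / (1 + exp(psi))^c; the abstract does not state this form, so treat it as an assumption. Under that assumption the identity rewrites the likelihood as 2^(-c) * exp(kappa*psi) * E[exp(-omega*psi^2/2)], with kappa = c*(y - 1/2) and omega drawn from PG(c, 0), which is Gaussian in psi once omega is given. The function names, the truncation level, and the sample count are hypothetical choices made for illustration, and the Polya-Gamma draw uses a truncated sum-of-gammas approximation rather than an exact sampler.

```python
import math
import random


def sample_pg_truncated(b, c, rng, n_terms=100):
    """Approximate draw from PG(b, c) via a truncated sum-of-gammas series.

    n_terms is an illustrative truncation level, not a tuned value.
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        g = rng.gammavariate(b, 1.0)  # Gamma(shape=b, scale=1)
        total += g / ((k - 0.5) ** 2 + (c * c) / (4.0 * math.pi ** 2))
    return total / (2.0 * math.pi ** 2)


def exact_regularized_likelihood(psi, y, c):
    """Logistic likelihood raised to the power c (assumed form)."""
    return math.exp(c * y * psi) / (1.0 + math.exp(psi)) ** c


def augmented_regularized_likelihood(psi, y, c, rng, n_samples=2000):
    """Monte Carlo estimate of the same quantity via Polya-Gamma augmentation.

    Averages exp(-omega * psi^2 / 2) over approximate PG(c, 0) draws,
    then applies the factor 2^(-c) * exp(kappa * psi).
    """
    kappa = c * (y - 0.5)
    acc = 0.0
    for _ in range(n_samples):
        omega = sample_pg_truncated(c, 0.0, rng)
        acc += math.exp(-0.5 * omega * psi * psi)
    return 2.0 ** (-c) * math.exp(kappa * psi) * (acc / n_samples)
```

Calling both functions with the same psi, y, and c (for example psi = 1.0, y = 1, c = 1.0, with rng = random.Random(0)) should give values that agree up to Monte Carlo and truncation error. Conditioned on omega, the exponent is quadratic in psi, which is what makes each Gibbs conditional available in closed form.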