Paper: Linguistic Structured Sparsity in Text Categorization

ACL ID P14-1074
Title Linguistic Structured Sparsity in Text Categorization
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

We introduce three linguistically motivated structured regularizers based on parse trees, topics, and hierarchical word clusters for text categorization. These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models. We show that our structured regularizers consistently improve classification accuracies compared to standard regularizers that penalize features in isolation (such as lasso, ridge, and elastic net regularizers) on a range of datasets for various text prediction problems: topic classification, sentiment analysis, and forecasting.
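
The abstract does not state the exact form of the penalties, so the following is only a minimal sketch of the contrast it draws. It assumes a group-lasso-style penalty as a stand-in for a structured regularizer. The function names, the toy vocabulary, the weights, and the grouping (imagined here as two topics) are all hypothetical and are not taken from the paper.

```python
import math

def lasso_penalty(w, lam=1.0):
    # Penalizes each feature weight in isolation (sum of absolute values).
    return lam * sum(abs(x) for x in w)

def ridge_penalty(w, lam=1.0):
    # Also per-feature: sum of squared weights.
    return lam * sum(x * x for x in w)

def group_lasso_penalty(w, groups, lam=1.0):
    # Assumed structured penalty: sum over groups of the Euclidean norm
    # of the weights in that group. Features in a group are penalized
    # jointly, which encourages whole groups to be zeroed out together.
    return lam * sum(math.sqrt(sum(w[i] ** 2 for i in g)) for g in groups)

# Hypothetical bag-of-words features and weights.
vocab = ["goal", "match", "team", "stock", "market"]
w = [0.3, 0.4, 0.0, 0.0, 0.0]

# Hypothetical linguistic grouping, e.g. words assigned to two topics.
groups = [[0, 1, 2], [3, 4]]

print("lasso:      ", lasso_penalty(w))
print("ridge:      ", ridge_penalty(w))
print("group lasso:", group_lasso_penalty(w, groups))
```

In this toy setup the second group contributes nothing to the grouped penalty because all of its weights are zero, which illustrates how prior knowledge about which words belong together can shape the sparsity pattern. The regularizers in the paper itself may differ from this form, for example in how groups are built or whether they overlap.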