Paper: Muli-label Text Categorization with Hidden Components

ACL ID D14-1193
Title Muli-label Text Categorization with Hidden Components
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

Multi-label text categorization (MTC) is a supervised learning task in which a document may be assigned multiple categories (labels) simultaneously. The labels in MTC are correlated, and this correlation gives rise to hidden components that represent the "share" variance of the correlated labels. In this paper, we propose a method with hidden components for MTC. The proposed method employs PCA to capture the hidden components and incorporates them into a joint learning framework to improve performance. Experiments with real-world data sets and evaluation metrics validate the effectiveness of the proposed method.
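As a rough illustration of how PCA can expose the variance shared by correlated labels, the sketch below extracts the leading principal component of a small document-by-label matrix. It is a minimal sketch under stated assumptions, not the method of the paper: applying PCA directly to the centered label matrix is one reading of the abstract, the toy matrix `Y` and all function names are hypothetical, power iteration stands in for a full eigendecomposition, and the joint learning framework is not covered.

```python
import math

def center_columns(Y):
    """Subtract each label's mean so PCA measures co-variation, not frequency."""
    n = len(Y)
    means = [sum(col) / n for col in zip(*Y)]
    return [[y - m for y, m in zip(row, means)] for row in Y]

def covariance(Yc):
    """Sample covariance matrix of the centered label columns."""
    n, num_labels = len(Yc), len(Yc[0])
    return [[sum(r[i] * r[j] for r in Yc) / (n - 1) for j in range(num_labels)]
            for i in range(num_labels)]

def top_component(C, iters=200):
    """Power iteration: leading eigenvector and eigenvalue of a symmetric matrix.

    Assumes C is non-zero and the start vector is not orthogonal to the
    leading eigenvector, which holds for the toy data below.
    """
    size = len(C)
    v = [1.0 / math.sqrt(size)] * size
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Cv = [sum(C[i][j] * v[j] for j in range(size)) for i in range(size)]
    eigval = sum(vi * cvi for vi, cvi in zip(v, Cv))
    return v, eigval

# Hypothetical toy data: 6 documents, 3 labels; labels 0 and 1 always co-occur.
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

Yc = center_columns(Y)
C = covariance(Yc)
component, eigval = top_component(C)

# Fraction of total label variance captured by the leading component.
explained = eigval / sum(C[i][i] for i in range(len(C)))

# Hidden-component value of each document: projection onto the component.
scores = [sum(y * c for y, c in zip(row, component)) for row in Yc]
```

In this toy example the two co-occurring labels receive identical loadings in `component`, which is the sense in which a single hidden component summarizes their shared variance. In practice a library PCA routine would replace the hand-written power iteration.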