Paper: The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

ACL ID P13-1041
Title The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

Expensive feature engineering based on WordNet senses has been shown to be useful for document-level sentiment classification. A plausible reason for this performance improvement is the reduction in data sparsity. However, such a reduction could be achieved with less effort through syntagma-based word clustering. In this paper, the problem of data sparsity in sentiment analysis, both monolingual and cross-lingual, is addressed through clustering. Experiments show that cluster-based data sparsity reduction yields better performance than sense-based classification for document-level sentiment analysis. A similar idea is applied to Cross Lingual Sentiment Analysis (CLSA), and it is shown that reduction in data sparsity (after translation or bilingual-mapping...