Paper: Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

ACL ID D10-1005
Title Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2010
Authors

In this paper, we develop multilingual supervised latent Dirichlet allocation (MLSLDA), a probabilistic generative model that allows insights gleaned from one language’s data to inform how the model captures properties of other languages. MLSLDA accomplishes this by jointly modeling two aspects of text: how multilingual concepts are clustered into thematically coherent topics and how topics associated with text connect to an observed regression variable (such as ratings on a sentiment scale). Concepts are represented in a general hierarchical framework that is flexible enough to express semantic ontologies, dictionaries, clustering constraints, and, as a special, degenerate case, conventional topic models. Both the topics and the regression are discovered via posterior inferenc...