Paper: Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media

ACL ID D13-1187
Title Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013
Authors

Different demographics, e.g., gender or age, can demonstrate substantial variation in their language use, particularly in informal contexts such as social media. In this paper we focus on learning gender differences in the use of subjective language in English, Spanish, and Russian Twitter data, and explore cross-cultural differences in emoticon and hashtag use for male and female users. We show that gender differences in subjective language can effectively be used to improve sentiment analysis, and in particular, polarity classification for Spanish and Russian. Our results show statistically significant relative F-measure improvement over the gender-independent baseline of 1.5% and 1% for Russian, 2% and 0.5% for Spanish, and 2.5% and 5% for English for polarity and subjectivit...