Paper: Developing Age and Gender Predictive Lexica over Social Media

ACL ID D14-1121
Title Developing Age and Gender Predictive Lexica over Social Media
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

Demographic lexica have potential for widespread use in social science, economic, and business applications. We derive predictive lexica (words and weights) for age and gender using regression and classification models from word usage in Facebook, blog, and Twitter data with associated demographic labels. The lexica, made publicly available, achieved state-of-the-art accuracy in language-based age and gender prediction over Facebook and Twitter, and were evaluated for generalization across social media genres as well as in limited-message situations.
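
To illustrate how a lexicon of words and weights can be used for prediction, the following is a minimal sketch, assuming the score is a weighted sum of relative word frequencies plus an intercept. The function name `lexicon_score`, the whitespace tokenizer, and every word, weight, and intercept value shown are hypothetical and are not taken from the published lexica.

```python
from collections import Counter

# Hypothetical toy lexicon: these words and weights are invented for
# illustration and are NOT values from the published lexica.
TOY_AGE_LEXICON = {"homework": -2.0, "mortgage": 3.0, "lol": -1.0}
TOY_AGE_INTERCEPT = 25.0  # assumed intercept term


def lexicon_score(text, lexicon, intercept=0.0):
    """Return intercept + sum of weight * relative frequency per lexicon word."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    if not tokens:
        # With no words to score, only the intercept remains.
        return intercept
    counts = Counter(tokens)
    total = len(tokens)
    return intercept + sum(
        weight * counts[word] / total
        for word, weight in lexicon.items()
        if word in counts
    )


if __name__ == "__main__":
    print(lexicon_score("lol homework homework mortgage",
                        TOY_AGE_LEXICON, TOY_AGE_INTERCEPT))
```

Under the same assumptions, a gender lexicon could be applied with the same function, reading the sign of the score as the predicted class.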