Paper: Unsupervised Word Usage Similarity in Social Media Texts

ACL ID S13-1036
Title Unsupervised Word Usage Similarity in Social Media Texts
Venue Joint Conference on Lexical and Computational Semantics
Session
Year 2013
Authors

We propose an unsupervised method for automatically calculating word usage similarity in social media data based on topic modelling, which we contrast with a baseline distributional method and Weighted Textual Matrix Factorization. We evaluate these methods against a novel dataset made up of human ratings over 550 Twitter message pairs annotated for usage similarity for a set of 10 nouns. The results show that our topic modelling approach outperforms the other two methods.
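
As a rough illustration of the idea in the abstract, the sketch below scores a pair of messages by comparing their topic distributions. The abstract does not say which similarity measure the authors use, so cosine similarity is an assumption here, and the three-topic distributions and the names `cosine_similarity`, `tweet_a`, `tweet_b`, and `tweet_c` are hypothetical values invented for the example rather than output from any real topic model.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_p * norm_q)


# Hypothetical per-message topic distributions over three topics
# (assumed for illustration; a topic model would normally infer these).
tweet_a = [0.7, 0.2, 0.1]
tweet_b = [0.6, 0.3, 0.1]
tweet_c = [0.1, 0.1, 0.8]

sim_ab = cosine_similarity(tweet_a, tweet_b)  # similar topic mix
sim_ac = cosine_similarity(tweet_a, tweet_c)  # different topic mix
```

Under these made-up inputs, the pair with the closer topic mix receives the higher score, which is the kind of system output that could then be compared against human usage similarity ratings.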