Paper: Linguistic Redundancy in Twitter

ACL ID D11-1061
Title Linguistic Redundancy in Twitter
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011
Authors

In the last few years, the research community's interest in micro-blogs and social media services, such as Twitter, has grown exponentially. Yet so far little attention has been paid to a key characteristic of micro-blogs: the high level of information redundancy. The aim of this paper is to approach this problem systematically by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general-purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy de...