Paper: Streaming Analysis of Discourse Participants

ACL ID D12-1005
Title Streaming Analysis of Discourse Participants
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012

Inferring attributes of discourse participants has been treated as a batch-processing task: data such as all tweets from a given author are gathered in bulk, processed, analyzed for a particular feature, then reported as a result of academic interest. Given the sources and scale of material used in these efforts, along with potential use cases of such analytic tools, discourse analysis should be reconsidered as a streaming challenge. We show that under certain common formulations, the batch-processing analytic framework can be decomposed into a sequential series of updates, using as an example the task of gender classification. Once in a streaming framework, and motivated by large data sets generated by social media services, we present novel results in approximate counting, sho...