Paper: SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations

ACL ID P12-1009
Title SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

One of the key tasks for analyzing conversational data is segmenting it into coherent topic segments. However, most models of topic segmentation ignore the social aspect of conversations, focusing only on the words used. We introduce a hierarchical Bayesian nonparametric model, Speaker Identity for Topic Segmentation (SITS), that discovers (1) the topics used in a conversation, (2) how these topics are shared across conversations, (3) when these topics shift, and (4) a person-specific tendency to introduce new topics. We evaluate against current unsupervised segmentation models to show that including person-specific information improves segmentation performance on meeting corpora and on political debates. Moreover, we provide evidence that SITS captures an individual's t...