Paper: Multi-Document Summarization using Sentence-based Topic Models

ACL ID P09-2075
Title Multi-Document Summarization using Sentence-based Topic Models
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2009

Most existing multi-document summarization methods decompose the documents into sentences and work directly in the sentence space using a term-sentence matrix. However, knowledge on the document side, i.e., the topics embedded in the documents, can aid context understanding and guide sentence selection in the summarization procedure. In this paper, we propose a new Bayesian sentence-based topic model for summarization that makes use of both the term-document and term-sentence associations. An efficient variational Bayesian algorithm is derived for model parameter estimation. Experimental results on benchmark data sets show the effectiveness of the proposed model for the multi-document summarization task.
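To make the two inputs named in the abstract concrete, the following is a minimal sketch of how a term-document matrix and a term-sentence matrix might be built from raw text. It is an illustration only: the function names (`tokenize`, `split_sentences`, `build_matrices`), the regex-based tokenizer, and the naive sentence splitter are all assumptions introduced here, and the sketch does not implement the Bayesian topic model or its variational inference.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(doc):
    """Naive sentence splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def build_matrices(docs):
    """Return (vocab, sentences, term_doc, term_sent).

    term_doc[i][d]  = count of vocab[i] in document d
    term_sent[i][s] = count of vocab[i] in sentence s
    sentences is a list of (document index, sentence text) pairs.
    """
    sentences = [(d, s) for d, doc in enumerate(docs) for s in split_sentences(doc)]
    doc_counts = [Counter(tokenize(doc)) for doc in docs]
    sent_counts = [Counter(tokenize(s)) for _, s in sentences]
    vocab = sorted(set().union(*doc_counts))
    term_doc = [[c[t] for c in doc_counts] for t in vocab]
    term_sent = [[c[t] for c in sent_counts] for t in vocab]
    return vocab, sentences, term_doc, term_sent


if __name__ == "__main__":
    docs = [
        "Topic models help. Models guide selection.",
        "Sentence selection matters.",
    ]
    vocab, sentences, term_doc, term_sent = build_matrices(docs)
    for term, doc_row, sent_row in zip(vocab, term_doc, term_sent):
        print(f"{term:10s} docs={doc_row} sentences={sent_row}")
```

Because every sentence belongs to exactly one document, each row of the term-sentence matrix sums to the same total as the corresponding row of the term-document matrix; the sentence-level matrix is simply a finer-grained view of the same counts.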