Paper: Corpus And Evaluation Measures For Multiple Document Summarization With Multiple Sources

ACL ID C04-1077
Title Corpus And Evaluation Measures For Multiple Document Summarization With Multiple Sources
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004
Authors

In this paper, we introduce a large-scale test collection for multiple document summarization, the Text Summarization Challenge 3 (TSC3) corpus. We detail the corpus construction and evaluation measures. The significant feature of the corpus is that it annotates not only the important sentences in a document set, but also those among them that have the same content. Moreover, we define new evaluation metrics taking redundancy into account and discuss the effectiveness of redundancy minimization.