Paper: Using N-Grams To Understand The Nature Of Summaries

ACL ID N04-4001
Title Using N-Grams To Understand The Nature Of Summaries
Venue Human Language Technologies
Session Short Paper
Year 2004

Although single-document summarization is a well-studied task, the nature of multi-document summarization is only beginning to be studied in detail. While close attention has been paid to what technologies are necessary when moving from single- to multi-document summarization, the properties of human-written multi-document summaries have not been quantified. In this paper, we empirically characterize human-written summaries provided in a widely used summarization corpus by attempting to answer the questions: Can multi-document summaries that are written by humans be characterized as extractive or generative? Are multi-document summaries less extractive than single-document summaries? Our results suggest that extraction-based techniques which have been successful for single-document summa...