Paper: Fast Generation Of Abstracts From General Domain Text Corpora By Extracting Relevant Sentences

ACL ID C96-2166
Title Fast Generation Of Abstracts From General Domain Text Corpora By Extracting Relevant Sentences
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996
Authors Klaus Zechner

Fast Generation of Abstracts from General Domain Text Corpora by Extracting Relevant Sentences

Klaus Zechner
Computational Linguistics Program, Department of Philosophy
135 Baker Hall, Carnegie Mellon University
Pittsburgh, PA 15213-3890, USA
zechner@andrew.cmu.edu

Abstract

This paper describes a system for generating text abstracts which relies on a general, purely statistical principle, i.e., on the notion of "relevance", as it is defined in terms of the combination of tf*idf weights of words in a sentence. The system generates abstracts from newspaper articles by selecting the "most relevant" sentences and combining them in text order. Since neither domain knowledge nor text-sort-specific heuristics are involved, this system provides maximal generality and flexibility. Also, it is fast and can be efficie...
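The selection principle described in the abstract (score each sentence by the tf*idf weights of its words, keep the highest-scoring sentences, and output them in text order) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the naive sentence splitter, the logarithmic idf formula, and the use of a plain sum as the "combination" of word weights are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(article, corpus, n_sentences=2):
    """Return the n highest-scoring sentences of `article`, in text order.

    Assumption: a sentence's relevance is the plain sum of the tf*idf
    weights of its words; the paper's exact combination may differ.
    """
    sentences = split_sentences(article)

    # Term frequency: how often each word occurs in the article itself.
    tf = Counter(tokenize(article))

    # Document frequency over the reference corpus plus the article,
    # so every article word appears in at least one document.
    docs = [set(tokenize(d)) for d in corpus] + [set(tf)]
    n_docs = len(docs)
    df = Counter(word for doc in docs for word in doc)

    def weight(word):
        # tf*idf with a logarithmic idf (a common variant, assumed here).
        return tf[word] * math.log(n_docs / df[word])

    scores = [sum(weight(w) for w in tokenize(s)) for s in sentences]

    # Pick the indices of the top-scoring sentences, then restore text order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:n_sentences])]
```

For example, with a two-document reference corpus about cats and dogs, the sketch keeps the sentences whose words are rare in the corpus:

```python
corpus = ["the cat sat on the mat", "the dog ran in the park"]
article = ("The cat sat. Quantum computers factor integers. "
           "The dog ran. Quantum algorithms use qubits.")
print(summarize(article, corpus, n_sentences=2))
```

Because the sketch needs only word counts, it uses no domain knowledge, which matches the generality the abstract claims.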