Paper: Exploiting Structure for Event Discovery Using the MDI Algorithm

ACL ID P07-3006
Title Exploiting Structure for Event Discovery Using the MDI Algorithm
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2007
Authors

Effectively identifying events in unstructured text is a very difficult task, largely because an individual event can be expressed by several sentences. In this paper, we investigate the use of clustering methods for the task of grouping the text spans in a news article that refer to the same event. The key idea is to cluster the sentences using a novel distance metric that exploits regularities in the sequential structure of events within a document. When this approach is compared to a simple bag-of-words baseline, a statistically significant increase in performance is observed.
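
The abstract does not specify the distance metric, so the following is only an illustrative sketch of the general idea: sentences are clustered with a distance that blends a bag-of-words term with a term reflecting how far apart the sentences sit in the document. The function names, the `alpha` weight, the positional term, the single-link merging rule, and the `threshold` value are all assumptions made for illustration; none of them is taken from the paper.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def combined_distance(sents, i, j, alpha=0.5):
    """Hypothetical metric (an assumption, not the paper's): blend lexical
    distance with the normalised positional gap between sentences i and j."""
    lexical = cosine_distance(
        Counter(sents[i].lower().split()), Counter(sents[j].lower().split())
    )
    positional = abs(i - j) / max(len(sents) - 1, 1)
    return alpha * lexical + (1 - alpha) * positional


def cluster_sentences(sents, threshold=0.6, alpha=0.5):
    """Single-link agglomerative clustering over sentence indices.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(sents))]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = min(
                    combined_distance(sents, i, j, alpha)
                    for i in clusters[x]
                    for j in clusters[y]
                )
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > threshold:
            break
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


if __name__ == "__main__":
    article = [
        "bomb exploded market",
        "bomb exploded city market",
        "president visited paris",
        "president visited france paris",
    ]
    print(cluster_sentences(article))
```

Setting `alpha=1.0` in this sketch drops the positional term and leaves a plain bag-of-words distance, which is the kind of baseline the abstract compares against.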