Paper: Using Multiple Discriminant Analysis Approach for Linear Text Segmentation

ACL ID I05-1026
Title Using Multiple Discriminant Analysis Approach for Linear Text Segmentation
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2005
Authors

Linear text segmentation has been an ongoing research focus in NLP for the last decade, with potential applications in document summarization, information retrieval and text understanding. Two critical problems remain, however: automatic boundary detection and automatic determination of the number of segments in a document. In this paper, we propose a new domain-independent statistical model for linear text segmentation. In our model, a Multiple Discriminant Analysis (MDA) criterion function is used to achieve global optimization, finding the best segmentation as the one with the largest word similarity within segments and the smallest word similarity between segments. To alleviate the high computati...
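The idea of scoring a segmentation by within-segment versus between-segment similarity can be illustrated with a small sketch. The abstract does not give the paper's actual criterion function, so everything below is an assumption made for illustration: sentences are represented as bag-of-words count vectors, the score is the scatter ratio tr(S_B) / tr(S_W) familiar from discriminant analysis, the number of segments k is supplied by hand, and the search is a brute-force enumeration. The function names (`vectorize`, `scatter_ratio`, `best_segmentation`) are hypothetical.

```python
from itertools import combinations


def vectorize(sentences):
    """Turn each sentence into a bag-of-words count vector (an assumed representation)."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for s in sentences:
        v = [0.0] * len(vocab)
        for w in s.lower().split():
            v[index[w]] += 1.0
        vectors.append(v)
    return vectors


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter_ratio(vectors, boundaries):
    """Return tr(S_B) / tr(S_W) for one candidate segmentation.

    `boundaries` holds the index of the first sentence of every segment
    except the first one, in increasing order.
    """
    cuts = [0, *boundaries, len(vectors)]
    segments = [vectors[a:b] for a, b in zip(cuts, cuts[1:])]
    global_mean = mean(vectors)
    within = 0.0   # scatter of sentences around their own segment mean
    between = 0.0  # scatter of segment means around the global mean
    for seg in segments:
        m = mean(seg)
        within += sum(sq_dist(v, m) for v in seg)
        between += len(seg) * sq_dist(m, global_mean)
    # Small constant avoids division by zero when every segment is uniform.
    return between / (within + 1e-9)


def best_segmentation(vectors, k):
    """Brute-force search over all ways to cut the text into k contiguous segments."""
    candidates = combinations(range(1, len(vectors)), k - 1)
    return max(candidates, key=lambda b: scatter_ratio(vectors, b))


sentences = [
    "cats purr softly",
    "cats purr loudly",
    "stocks fell sharply",
    "stocks fell again",
]
print(best_segmentation(vectorize(sentences), 2))  # (2,)
```

On this toy input the highest-scoring cut falls before the third sentence, where the vocabulary changes. Exhaustive enumeration grows combinatorially with document length, so it is only practical for very short texts, and the sketch does not attempt to choose the number of segments automatically.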