Paper: Semantic Language Models For Topic Detection And Tracking

ACL ID N03-3001
Title Semantic Language Models For Topic Detection And Tracking
Venue Human Language Technologies
Session Student Session
Year 2003
Authors

In this work, we present a new semantic lan- guage modeling approach to model news sto- ries in the Topic Detection and Tracking (TDT) task. In the new approach, we build a unigram language model for each semantic class in a news story. We also cast the link detection sub- task of TDT as a two-class classi cation prob- lem in which the features of each sample con- sist of the generative log-likelihood ratios from each semantic class. We then compute a lin- ear discriminant classi er using the perceptron learning algorithm on the training set. Results on the test set show a marginal improvement over the unigram performance, but are not very encouraging on the whole.