Paper: Document Classification Using A Finite Mixture Model

ACL ID P97-1006
Title Document Classification Using A Finite Mixture Model
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1997
Authors

We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of classifying documents as that of conducting statistical hypothesis testing over finite mixture models, and employ the EM algorithm to efficiently estimate parameters in a finite mixture model. Experimental results indicate that our method outperforms existing methods.
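
The idea in the abstract can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the two word clusters and their probabilities P(word | cluster) are invented toy values held fixed, EM estimates only the per-category mixing weights, and the hypothesis test is approximated by choosing the category whose mixture gives the document the highest log-likelihood. The names `em_mixing_weights`, `log_likelihood`, and `classify` are hypothetical.

```python
import math

# Assumed toy soft clustering: "score" belongs to both clusters.
CLUSTERS = [
    {"ball": 0.5, "game": 0.4, "score": 0.1},      # cluster 0
    {"tariff": 0.5, "export": 0.4, "score": 0.1},  # cluster 1
]

# Assumed toy training words for each category.
TRAINING = {
    "sports": ["ball", "game", "ball", "score"],
    "trade": ["tariff", "export", "tariff", "score"],
}


def em_mixing_weights(words, cluster_word_probs, iterations=50):
    """Estimate mixing weights P(cluster | category) by EM, with P(word | cluster) fixed."""
    k = len(cluster_word_probs)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        counts = [0.0] * k
        for w in words:
            # E-step: posterior responsibility of each cluster for this word.
            joint = [weights[j] * cluster_word_probs[j].get(w, 0.0) for j in range(k)]
            total = sum(joint)
            if total == 0.0:
                continue  # word is not covered by any cluster
            for j in range(k):
                counts[j] += joint[j] / total
        norm = sum(counts)
        if norm == 0.0:
            break
        # M-step: renormalize the expected counts into new mixing weights.
        weights = [c / norm for c in counts]
    return weights


def log_likelihood(words, weights, cluster_word_probs, floor=1e-9):
    """Log-likelihood of the words under one category's mixture (floored to avoid log 0)."""
    ll = 0.0
    for w in words:
        p = sum(wt * probs.get(w, 0.0) for wt, probs in zip(weights, cluster_word_probs))
        ll += math.log(max(p, floor))
    return ll


def classify(words, category_weights, cluster_word_probs):
    """Pick the category whose mixture model assigns the document the highest likelihood."""
    return max(
        category_weights,
        key=lambda c: log_likelihood(words, category_weights[c], cluster_word_probs),
    )


CATEGORY_WEIGHTS = {c: em_mixing_weights(ws, CLUSTERS) for c, ws in TRAINING.items()}
```

Calling `classify(["game", "ball"], CATEGORY_WEIGHTS, CLUSTERS)` compares the document's likelihood under each category's mixture. Sharing the cluster distributions across categories and fitting only the mixing weights keeps the number of parameters per category small, which is one plausible reading of why a mixture over word clusters is attractive here.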