Paper: Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-Of-Speech Tagging

ACL ID P10-2039
Title Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-Of-Speech Tagging
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2010
Authors

The Minimum Description Length (MDL) principle is a method for model selection that trades off between the explanation of the data by the model and the complexity of the model itself. Inspired by the MDL principle, we develop an objective function for generative models that captures the description of the data by the model (log-likelihood) and the description of the model (model size). We also develop an efficient general search algorithm based on the MAP-EM framework to optimize this function. Since recent work has shown that minimizing the model size in a Hidden Markov Model for part-of-speech (POS) tagging leads to higher accuracies, we test our approach by applying it to this problem. The search algorithm involves a simple change to EM and achieves high POS tagging accuracies on...
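The trade-off described in the abstract, between how well the model explains the data and how large the model is, can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes that model size is measured as the number of nonzero parameters of a single multinomial, and the function name `mdl_style_objective` and the weight `alpha` are hypothetical names introduced only for this illustration.

```python
import math


def mdl_style_objective(counts, probs, alpha=1.0, eps=1e-12):
    """Score a multinomial model: lower is better.

    Assumption (not from the paper): the score is the negative
    log-likelihood of the data plus `alpha` times the number of
    parameters that are effectively nonzero.
    """
    # Data term: negative log-likelihood of the observed counts.
    nll = 0.0
    for event, count in counts.items():
        if count == 0:
            continue
        p = probs.get(event, 0.0)
        if p <= eps:
            # The model assigns zero probability to observed data.
            return math.inf
        nll -= count * math.log(p)

    # Model term: how many parameters the model actually uses.
    model_size = sum(1 for p in probs.values() if p > eps)
    return nll + alpha * model_size


# Toy data: tag counts for one word type (hypothetical values).
counts = {"DT": 3, "NN": 1}
dense = {"DT": 0.5, "NN": 0.3, "VB": 0.2}
sparse = {"DT": 0.75, "NN": 0.25, "VB": 0.0}

dense_score = mdl_style_objective(counts, dense)
sparse_score = mdl_style_objective(counts, sparse)
```

In this toy setting the sparse model scores lower than the dense one, because it both fits the counts better and uses fewer nonzero parameters. A count of nonzero parameters is not differentiable, so a practical optimizer would likely need a smooth surrogate for the model term; which surrogate the paper uses is not stated in this excerpt.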