Paper: Improving Language Model Size Reduction Using Better Pruning Criteria

ACL ID P02-1023
Title Improving Language Model Size Reduction Using Better Pruning Criteria
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2002
Authors

Reducing language model (LM) size is a critical issue when applying an LM to realistic applications that have memory constraints. In this paper, three measures are studied for the purpose of LM pruning: probability, rank, and entropy. We evaluated the performance of the three pruning criteria in a real application of Chinese text input in terms of character error rate (CER). We first present an empirical comparison, showing that rank performs the best in most cases. We also show that the high performance of rank stems from its strong correlation with error rate. We then present a novel method of combining two criteria in model pruning. Experimental results show that the combined criterion consistently leads to smaller models than the models pruned using either of the criteria separately.
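The abstract contrasts probability-, rank-, and entropy-based pruning scores. The sketch below illustrates, under loose assumptions, how the first two measures might be computed for a toy bigram model; the corpus, helper names, and the cutoff k are invented for illustration, and the entropy criterion (relative entropy between the model before and after pruning an entry, in the style of Stolcke pruning) is omitted for brevity. The paper's exact criterion definitions, applied to large backoff n-gram LMs, should be taken from the paper itself.

```python
from collections import Counter

# Toy corpus and maximum-likelihood bigram model (illustrative only).
corpus = "the cat sat on the mat the cat ran".split()

history = Counter(corpus[:-1])             # counts of words used as a history
bigram = Counter(zip(corpus, corpus[1:]))  # bigram counts
total = len(corpus)

def p_uni(w):
    """Unigram probability p(w)."""
    return corpus.count(w) / total

def p_bi(h, w):
    """Conditional bigram probability p(w | h)."""
    return bigram[(h, w)] / history[h]

# Probability criterion: score each bigram by p(h) * p(w|h);
# the lowest-scoring entries are pruned first.
prob_score = {(h, w): p_uni(h) * p_bi(h, w) for (h, w) in bigram}

# Rank criterion: score each bigram by the rank of w among the observed
# continuations of h under p(.|h) (0 = most probable continuation).
def rank_of(h, w):
    conts = sorted((w2 for (h2, w2) in bigram if h2 == h),
                   key=lambda w2: -p_bi(h, w2))
    return conts.index(w)

rank_score = {(h, w): rank_of(h, w) for (h, w) in bigram}

# Prune the k lowest-probability bigrams (k is an arbitrary example value).
k = 2
pruned_away = sorted(prob_score, key=prob_score.get)[:k]
kept = {hw for hw in bigram if hw not in pruned_away}
print(f"pruned {pruned_away}; kept {len(kept)} of {len(bigram)} bigrams")
```

In a real system the pruned probability mass would be redistributed through the backoff distribution, and the criteria would be thresholded against a target model size rather than a fixed k.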