Paper: Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging

ACL ID C08-1049
Title Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008
Authors

In this paper, we describe a new reranking strategy, named word lattice reranking, for the task of joint Chinese word segmentation and part-of-speech (POS) tagging. Derived from forest reranking for parsing (Huang, 2008), this strategy reranks on the pruned word lattice, which can contain many more candidates than a traditional n-best list while using less storage. With a perceptron classifier trained on local features as the baseline, word lattice reranking performs reranking with non-local features that cannot be easily incorporated into the perceptron baseline. Experimental results show that this strategy improves both segmentation and POS tagging over the perceptron baseline and over n-best list reranking.
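To make the idea of reranking on a lattice concrete, the following is a minimal sketch and not the authors' implementation. It assumes a lattice whose edges are (word, POS) candidates over character spans, invented local scores standing in for the perceptron baseline, and a single hypothetical non-local feature: a bonus on adjacent POS-tag pairs. The names `Edge`, `count_paths`, `best_path`, and `LATTICE` are illustrative only.

```python
from typing import Dict, List, NamedTuple, Tuple


class Edge(NamedTuple):
    start: int    # index of the first character covered by the word
    end: int      # index one past the last character covered
    word: str
    pos: str
    score: float  # local (baseline) score; values here are made up


def count_paths(edges: List[Edge], n: int) -> int:
    """Count the full segmentation/tagging candidates the lattice encodes."""
    ways = [0] * (n + 1)
    ways[0] = 1
    # Every edge ending at position i starts before i, so sorting by
    # start guarantees ways[e.start] is final when e is processed.
    for e in sorted(edges, key=lambda e: e.start):
        ways[e.end] += ways[e.start]
    return ways[n]


def best_path(
    edges: List[Edge], n: int, bigram_bonus: Dict[Tuple[str, str], float]
) -> Tuple[float, List[Edge]]:
    """Best path under local scores plus a POS-bigram bonus.

    The dynamic-programming state is the last edge on the path, which is
    what allows a feature spanning two adjacent words to be scored.
    """
    best: Dict[Edge, Tuple[float, List[Edge]]] = {}
    for e in sorted(edges, key=lambda e: e.start):
        if e.start == 0:
            best[e] = (e.score, [e])
            continue
        candidates = [
            (s + e.score + bigram_bonus.get((prev.pos, e.pos), 0.0), path + [e])
            for prev, (s, path) in best.items()
            if prev.end == e.start
        ]
        if candidates:
            best[e] = max(candidates, key=lambda c: c[0])
    finals = [v for e, v in best.items() if e.end == n]
    return max(finals, key=lambda c: c[0])


# Toy lattice for the four-character string "研究生命" (hypothetical scores).
LATTICE = [
    Edge(0, 2, "研究", "NN", 1.0),
    Edge(0, 2, "研究", "VV", 0.75),
    Edge(0, 3, "研究生", "NN", 1.25),
    Edge(2, 4, "生命", "NN", 1.0),
    Edge(3, 4, "命", "NN", 0.25),
]

if __name__ == "__main__":
    print(count_paths(LATTICE, 4))
    score, path = best_path(LATTICE, 4, {("VV", "NN"): 0.5})
    print(score, [(e.word, e.pos) for e in path])
```

In this toy case five edges encode three complete candidates, and the edges for 研究 and 生命 are shared between candidates rather than repeated as they would be in an n-best list. The bigram bonus changes which candidate wins, which illustrates how a feature outside the reach of purely local scoring can alter the ranking.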