Paper: A Classification-based Algorithm for Consistency Check of Part-of-Speech Tagging for Chinese Corpora

ACL ID I05-2001
Title A Classification-based Algorithm for Consistency Check of Part-of-Speech Tagging for Chinese Corpora
Venue International Joint Conference on Natural Language Processing
Session poster-demo-tutorial
Year 2005
Authors

Ensuring consistency of Part-of-Speech (POS) tagging plays an important role in constructing high-quality Chinese corpora. After analyzing the POS tagging of multi-category words in large-scale corpora, we propose a novel consistency check method of POS tagging in this paper. Our method builds a vector model of the context of multi-category words, and uses the k-NN algorithm to classify context vectors constructed from POS tagging sequences and judge their consistency. The experimental results indicate that the proposed method is feasible and effective.
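
The pipeline outlined in the abstract (context vectors built from POS tag sequences, then nearest-neighbour classification to judge consistency) can be illustrated with a minimal sketch. The abstract does not specify the vector encoding, the distance measure, the window size, or the number of neighbours, so everything below is an assumption made for illustration: one-hot encoding of neighbouring tags, Euclidean distance, a majority vote among k neighbours, and the hypothetical names `context_vector`, `knn_predict`, and `is_consistent`. It is not the authors' implementation.

```python
import math
from collections import Counter


def context_vector(tags, index, tagset, window=1):
    """One-hot encode the POS tags around position `index` (assumed encoding)."""
    vec = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the multi-category word itself
        pos = index + offset
        tag = tags[pos] if 0 <= pos < len(tags) else None  # None = sentence boundary
        vec.extend(1.0 if tag == t else 0.0 for t in tagset)
    return vec


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority tag among the k training vectors nearest to `query`."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def is_consistent(train, query, annotated_tag, k=3):
    """Treat an annotation as consistent if it matches the k-NN prediction."""
    return knn_predict(train, query, k) == annotated_tag


# Toy tag set and training contexts for one multi-category word (invented data).
TAGSET = ["n", "v", "d", "u", "p"]
TRAIN = [
    (context_vector(["d", "v", "n"], 1, TAGSET), "v"),
    (context_vector(["d", "v", "u"], 1, TAGSET), "v"),
    (context_vector(["u", "n", "v"], 1, TAGSET), "n"),
    (context_vector(["u", "n", "p"], 1, TAGSET), "n"),
]

# An occurrence annotated "n" in a context that mostly resembles the "v" examples.
suspect = context_vector(["d", "n", "p"], 1, TAGSET)
print(is_consistent(TRAIN, suspect, "n"))
```

In this toy setting, an occurrence whose annotated tag disagrees with the tags of its nearest contexts would be flagged for manual review; a practical system would presumably need a larger training set and a tuned choice of k and window size.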