Paper: A Lexicon-Constrained Character Model for Chinese Morphological Analysis

ACL ID I05-1048
Title A Lexicon-Constrained Character Model for Chinese Morphological Analysis
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2005
Authors

This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can...
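
The step from character tags to segmentation and part-of-speech output can be illustrated with a minimal sketch. The abstract does not specify the tag inventory, so the format used here is an assumption: each tag is "<position>-<POS>", where the position is B (word-initial), M (word-internal), E (word-final), or S (single-character word). The function name decode_char_tags and the example tags are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def decode_char_tags(chars: List[str], tags: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Merge per-character tags into (word, pos) pairs.

    Assumed tag format (not from the paper): "<position>-<POS>", with
    position in {B, M, E, S}. The POS of a word is taken from the tag
    of its first character.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[Tuple[str, Optional[str]]] = []
    buf: List[str] = []            # characters of the word being built
    buf_pos: Optional[str] = None  # POS label of the word being built

    for ch, tag in zip(chars, tags):
        position, _, pos = tag.partition("-")
        if position in ("B", "S"):
            if buf:
                # Previous word was never closed by an E tag; flush it.
                words.append(("".join(buf), buf_pos))
            buf, buf_pos = [ch], pos
        else:
            # M or E continues the current word (or starts one if the
            # tag sequence is malformed and no word is open).
            if not buf:
                buf_pos = pos
            buf.append(ch)
        if position in ("E", "S"):
            words.append(("".join(buf), buf_pos))
            buf, buf_pos = [], None

    if buf:
        # Sentence ended in the middle of a word; keep what was collected.
        words.append(("".join(buf), buf_pos))
    return words


# Example with hypothetical tags for a five-character sentence.
example = decode_char_tags(
    list("我喜欢北京"),
    ["S-PN", "B-VV", "E-VV", "B-NR", "E-NR"],
)
print(example)  # [('我', 'PN'), ('喜欢', 'VV'), ('北京', 'NR')]
```

Because word boundaries and part-of-speech labels are both read off the same tag sequence, segmentation and tagging come out of one decoding pass. The sketch does not model the lexicon constraint or the tagger itself.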