Paper: CSeg&Tag1.0: A Practical Word Segmenter And POS Tagger For Chinese Texts

ACL ID A97-1018
Title CSeg&Tag1.0: A Practical Word Segmenter And POS Tagger For Chinese Texts
Venue Applied Natural Language Processing Conference
Session Main Conference
Year 1997
Authors

Chinese word segmentation and POS tagging are two key techniques in many Chinese information processing applications. Considerable effort has been devoted to this research over the last decade, but unfortunately no practical, high-performance system for unrestricted texts is available to date. This paper introduces CSeg&Tag1.0, a Chinese word segmenter and POS tagger that unifies these two procedures into one model. Preliminary open tests show that the segmentation precision of CSeg&Tag1.0 is about 98.0% - 99.3%, the POS tagging precision is about 91.0% - 97.1%, and the recall and precision for unknown words range from 95.0% to 99.0% and from 87.6% to 95.3%, respectively. The processing speed is about 100 characters per second on a Pentium 133 PC. The work of improving the performanc...
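
For readers unfamiliar with how segmentation precision and recall figures like those above are usually obtained, the following is a minimal sketch of one common scoring scheme, which compares character spans of the predicted words against a gold segmentation. It is an illustration under stated assumptions, not the evaluation procedure used for CSeg&Tag1.0: the abstract does not specify one, and the function names `to_spans` and `segmentation_scores`, along with the example sentence, are hypothetical.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_scores(gold_words, predicted_words):
    """Return (precision, recall) of a predicted segmentation.

    Assumption: a predicted word counts as correct only if both of its
    boundaries match a word in the gold segmentation.
    """
    if "".join(gold_words) != "".join(predicted_words):
        raise ValueError("both segmentations must cover the same text")
    gold = to_spans(gold_words)
    predicted = to_spans(predicted_words)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: the gold segmentation has three words; the
# prediction wrongly splits the first four characters as 3 + 1.
gold = ["研究", "生命", "起源"]
predicted = ["研究生", "命", "起源"]
precision, recall = segmentation_scores(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this scheme, precision is the fraction of predicted words that are correct and recall is the fraction of gold words that were recovered. Unknown-word scores such as those reported in the abstract could be computed the same way by restricting both span sets to words absent from the lexicon, though that restriction is likewise an assumption here.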