Paper: Chinese Parsing Exploiting Characters

ACL ID P13-1013
Title Chinese Parsing Exploiting Characters
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013

Characters play an important role in the Chinese language, yet computational processing of Chinese has been dominated by word-based approaches, in which the leaves of syntax trees are words. We investigate Chinese parsing at the character level, extending the notion of phrase-structure trees by annotating the internal structures of words. We demonstrate the importance of character-level information to Chinese processing by building a joint segmentation, part-of-speech (POS) tagging and phrase-structure parsing system that integrates character-structure features. Our joint system significantly outperforms a state-of-the-art word-based baseline on the standard CTB5 test set, and gives the best published results for Chinese parsing.
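
To make the idea of a character-level phrase-structure tree concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `Node` class, a hypothetical `is_word` flag marking the root of each word's internal structure, and an illustrative inner label `NN-inner`; none of these names come from the paper. It shows how segmentation and POS tags could both be read off a single tree whose leaves are characters.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    # At a leaf, `label` is a single character; elsewhere it is a
    # phrase label, a POS tag, or a word-internal label (all illustrative).
    label: str
    children: List["Node"] = field(default_factory=list)
    # Hypothetical flag: True at the root of a word's internal structure.
    is_word: bool = False


def chars(node: Node) -> str:
    """Concatenate the characters at the leaves below `node`."""
    if not node.children:
        return node.label
    return "".join(chars(child) for child in node.children)


def words_and_tags(node: Node) -> List[Tuple[str, str]]:
    """Recover (word, POS) pairs, i.e. the segmentation and tagging."""
    if node.is_word:
        return [(chars(node), node.label)]
    pairs: List[Tuple[str, str]] = []
    for child in node.children:
        pairs.extend(words_and_tags(child))
    return pairs


# Toy example (assumed, not taken from the paper's data):
# an NP covering two words, the second with nested internal structure.
tree = Node("NP", [
    Node("NR", [Node("中"), Node("国")], is_word=True),
    Node("NN", [
        Node("NN-inner", [Node("建"), Node("筑")]),
        Node("业"),
    ], is_word=True),
])

print(words_and_tags(tree))
```

In this representation the word boundaries, the POS tags and the phrase structure are all encoded in one tree, which is why a single parser over characters can in principle perform the three tasks jointly.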