Paper: Integrating Punctuation Rules and Naïve Bayesian Model for Chinese Creation Title Recognition

ACL ID I05-1073
Title Integrating Punctuation Rules and Naïve Bayesian Model for Chinese Creation Title Recognition
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2005
Authors

Creation titles, i.e. titles of literary and/or artistic works, comprise over 7% of named entities in Chinese documents. They are the fourth largest category of named entities in Chinese, after personal names, location names, and organization names. However, they have rarely been mentioned or studied. Chinese title recognition is challenging for the following reasons. Titles have few internal features, and there are almost no restrictions on their naming style. Their lengths and structures vary. Worst of all, they are generally composed of common words, so they look like ordinary sentence fragments. In this paper, we integrate punctuation rules, a lexicon, and naïve Bayesian models to recognize creation titles in Chinese documents. This pioneering study shows a precision...