ACL ID | P03-2039 |
---|---|
Title | Chinese Unknown Word Identification Using Character-Based Tagging And Chunking |
Venue | Annual Meeting of the Association of Computational Linguistics |
Session | System Demonstration |
Year | 2003 |
Authors |
|
Since written Chinese has no spaces to delimit words, segmenting Chinese text is an essential task. During this task, the problem of unknown words arises: it is impossible to register all words in a dictionary, as new words can always be created by combining characters. We propose a unified solution to detect unknown words in Chinese texts. First, a morphological analysis is done to obtain an initial segmentation and POS tags; then a chunker is used to detect unknown words.
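The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the position tags (`B`, `I`, `E`, `S`), the chunk labels (`B`, `I`, `O`), the POS tag names, the function names `to_char_tags` and `decode_chunks`, and the example sentence are all assumptions made for illustration. The morphological analyzer and the chunker are not implemented; the sketch only shows how an initial segmentation might be expanded to character-level tags, and how per-character chunk labels could be decoded into unknown-word candidates.

```python
from typing import List, Tuple


def to_char_tags(segmented: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Expand (word, POS) pairs into (character, 'POS-position') pairs.

    Position tags (an assumed scheme): S = single-character word,
    B = first character, I = middle character, E = last character.
    """
    out = []
    for word, pos in segmented:
        n = len(word)
        for i, ch in enumerate(word):
            if n == 1:
                position = "S"
            elif i == 0:
                position = "B"
            elif i == n - 1:
                position = "E"
            else:
                position = "I"
            out.append((ch, f"{pos}-{position}"))
    return out


def decode_chunks(chars: List[str], labels: List[str]) -> List[str]:
    """Collect unknown-word candidates from per-character B/I/O labels.

    B starts a chunk, I continues an open chunk, anything else closes it.
    """
    words, current = [], ""
    for ch, label in zip(chars, labels):
        if label == "B":
            if current:
                words.append(current)
            current = ch
        elif label == "I" and current:
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


# Hypothetical output of an initial segmentation step: the name at the end
# is not in the dictionary, so it was split into single characters.
initial = [("我", "r"), ("认识", "v"), ("王", "n"), ("小", "a"), ("明", "a")]
char_features = to_char_tags(initial)

# Hypothetical chunk labels that a trained chunker might assign.
chunk_labels = ["O", "O", "O", "B", "I", "I"]
unknown_words = decode_chunks([ch for ch, _ in char_features], chunk_labels)
print(unknown_words)
```

In a real system the chunk labels would come from a trained classifier that reads the character-level features; here they are hard-coded so the decoding step can be followed by hand.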