Paper: Incorporating New Words Detection with Chinese Word Segmentation

ACL ID W10-4134
Title Incorporating New Words Detection with Chinese Word Segmentation
Venue Joint Conference on Chinese Language Processing
Session Main Conference
Year 2010
Authors

With advances in Chinese word segmentation, in-vocabulary word segmentation and named entity recognition have achieved state-of-the-art performance. However, new words have become a bottleneck for Chinese word segmentation. This paper presents the results of the Beijing Institute of Technology (BIT) system in the Sixth International Chinese Word Segmentation Bakeoff in 2010. The paper first reviews the problems caused by new words in Chinese text, then introduces the new word detection algorithm. The final section reports the official evaluation results from the bakeoff and gives conclusions.