Paper: A Trainable Rule-Based Algorithm For Word Segmentation

ACL ID P97-1041
Title A Trainable Rule-Based Algorithm For Word Segmentation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1997
Authors

This paper presents a trainable rule-based algorithm for performing word segmentation. The algorithm provides a simple, language-independent alternative to large-scale lexical-based segmenters requiring large amounts of knowledge engineering. As a stand-alone segmenter, we show our algorithm to produce high-performance Chinese segmentation. In addition, we show the transformation-based algorithm to be effective in improving the output of several existing word segmentation algorithms in three different languages.
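To make the idea of rule-based correction of an existing segmenter's output more concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not specify the rule format, so everything here is an assumption made for illustration: the `Rule` type, the two actions ("insert" or "delete" a boundary), the use of the two adjacent characters as the only context, and the function names `boundaries` and `apply_rules`. The training step that would learn and order such rules is not shown.

```python
from typing import List, NamedTuple, Set


class Rule(NamedTuple):
    """Hypothetical rule format: edit the boundary between two adjacent characters."""
    action: str  # "insert" or "delete"
    left: str    # character immediately before the boundary position
    right: str   # character immediately after the boundary position


def boundaries(words: List[str]) -> Set[int]:
    """Return the character offsets at which a word boundary falls."""
    cuts: Set[int] = set()
    pos = 0
    for word in words[:-1]:
        pos += len(word)
        cuts.add(pos)
    return cuts


def apply_rules(words: List[str], rules: List[Rule]) -> List[str]:
    """Apply an ordered list of rules to an initial segmentation."""
    text = "".join(words)
    if not text:
        return []
    cuts = boundaries(words)
    # Rules are applied in order; each rule is applied everywhere it matches.
    for rule in rules:
        for i in range(1, len(text)):
            if text[i - 1] == rule.left and text[i] == rule.right:
                if rule.action == "insert":
                    cuts.add(i)
                elif rule.action == "delete":
                    cuts.discard(i)
    # Rebuild the word list from the final set of boundary offsets.
    edges = [0] + sorted(cuts) + [len(text)]
    return [text[a:b] for a, b in zip(edges, edges[1:])]


# Initial segmentation: every character is its own word (assumed baseline).
initial = list("thecatsat")
rules = [
    Rule("delete", "t", "h"),
    Rule("delete", "h", "e"),
    Rule("delete", "c", "a"),
    Rule("delete", "a", "t"),
    Rule("delete", "s", "a"),
]
print(apply_rules(initial, rules))  # ['the', 'cat', 'sat']
```

The same function can post-process the output of any other segmenter, since it only needs a list of words as input. For example, `apply_rules(["thecat"], [Rule("insert", "e", "c")])` splits an under-segmented word into `["the", "cat"]`.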