Paper: Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation

ACL ID D12-1038
Title Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

In this paper we first describe the technology of automatic annotation transformation, which is based on the annotation adaptation algorithm (Jiang et al., 2009). It can automatically transform a human-annotated corpus from one annotation guideline to another. We then propose two optimization strategies, iterative training and predict-self reestimation, to further improve the accuracy of annotation guideline transformation. Experiments on Chinese word segmentation show that the iterative training strategy together with predict-self reestimation brings significant improvement over the simple annotation transformation baseline, and leads to classifiers with significantly higher accuracy and several times faster processing than annotation adaptation does. On the Penn Chinese...