Paper: Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features

ACL ID D12-1126
Title Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

In modern Chinese articles and conversations, it is very common to include a few English words, especially in emails and Internet literature. Analyzing Chinese-English mixed texts has therefore become an important and challenging topic. The underlying problem is how to assign part-of-speech (POS) tags to the English words involved. Due to the lack of a specially annotated corpus, most English words are tagged with the oversimplified type "foreign words". In this paper, we present a method that uses dynamic features to tag the POS of mixed texts. Experiments show that our method achieves higher performance than traditional sequence labeling methods. Our method also boosts the performance of POS tagging for pure Chinese texts.