Paper: Automatic Refinement Of A POS Tagger Using A Reliable Parser And Plain Text Corpora

ACL ID C00-1046
Title Automatic Refinement Of A POS Tagger Using A Reliable Parser And Plain Text Corpora
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2000
Authors

This paper proposes a new unsupervised learning method for obtaining English part-of-speech (POS) disambiguation rules which would improve the accuracy of a POS tagger. This method has been implemented in the experimental system APRAS (Automatic POS Rule Acquisition System), which extracts POS disambiguation rules from plain text corpora by utilizing different types of coded linguistic knowledge, i.e., POS tagging rules and syntactic parsing rules, which are already stored in a fully implemented MT system. In our experiment, the obtained rules were applied to 1.7% of the sentences in a non-training corpus. For this group of sentences, 78.4% of the changes made in tagging results were an improvement. We also saw a 15.5% improvement in tagging and parsing speed and an 8.0% increase of pa...