Paper: On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora

ACL ID P98-2164
Title On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998
Authors

This paper addresses the issue of POS tagger evaluation. Such evaluation is usually performed by comparing the tagger output with a reference test corpus, which is assumed to be error-free. Currently used corpora contain noise which causes the obtained performance to be a distortion of the real value. We analyze to what extent this distortion may invalidate the comparison between taggers or the measure of the improvement given by a new system. The main conclusion is that a more rigorous testing experimentation setting/designing is needed to reliably evaluate and compare tagger accuracies.
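
The distortion described in the abstract can be illustrated with a simple worst-case bound. The sketch below is not the paper's analysis; it is a minimal illustration under stated assumptions: each token has exactly one correct tag, a known fraction `noise_rate` of the reference tags is wrong, and nothing is known about how the tagger behaves on those tokens. On a wrongly annotated token, agreeing with the reference means the tagger is wrong, while disagreeing may or may not mean it is right, so the accuracy against an error-free reference can differ from the observed one by at most `noise_rate` in either direction. The function names and the example figures are hypothetical.

```python
def true_accuracy_bounds(observed, noise_rate):
    """Worst-case interval for accuracy against an error-free reference.

    observed   -- accuracy measured against the noisy reference corpus
    noise_rate -- assumed fraction of reference tags that are wrong
    Both values are fractions in [0, 1].
    """
    if not 0.0 <= observed <= 1.0 or not 0.0 <= noise_rate <= 1.0:
        raise ValueError("observed and noise_rate must lie in [0, 1]")
    # Every noisy token can hide at most one real error (upper bound)
    # or at most one real success (lower bound).
    lower = max(0.0, observed - noise_rate)
    upper = min(1.0, observed + noise_rate)
    return lower, upper


def comparison_is_conclusive(observed_a, observed_b, noise_rate):
    """True only if the two worst-case intervals do not overlap."""
    low_a, high_a = true_accuracy_bounds(observed_a, noise_rate)
    low_b, high_b = true_accuracy_bounds(observed_b, noise_rate)
    return low_a > high_b or low_b > high_a


if __name__ == "__main__":
    # Illustrative figures only, not results from the paper.
    print(true_accuracy_bounds(0.97, 0.03))
    print(comparison_is_conclusive(0.97, 0.965, 0.03))
    print(comparison_is_conclusive(0.97, 0.90, 0.03))
```

Under these assumptions, a difference between two taggers that is smaller than the corpus noise rate cannot be taken as conclusive from the observed figures alone. The bound is deliberately pessimistic; a probabilistic treatment of tagger behaviour on noisy tokens would give narrower intervals.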