Paper: Linguistic Indeterminacy As A Source Of Errors In Tagging

ACL ID C96-2114
Title Linguistic Indeterminacy As A Source Of Errors In Tagging
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996
Authors

Most evaluations of part-of-speech tagging compare the output of an automatic tagger to some established standard, define the differences as tagging errors, and try to remedy them by, e.g., more training of the tagger. The present article is based on a manual analysis of a large number of tagging errors. Some clear patterns among the errors can be discerned, and the sources of the errors as well as possible alternative methods of remedy are presented and discussed. In particular, the problems with undecidable cases are treated.

1 Background

When the performance of automatic part-of-speech taggers is discussed, it is normally measured relative to some standard material, such as the Brown Corpus, or to a manual tagging or a manual proof-reading of (some smaller part of) the tagged material. The...