Paper: Teaching A Weaker Classifier: Named Entity Recognition On Upper Case Text

ACL ID P02-1061
Title Teaching A Weaker Classifier: Named Entity Recognition On Upper Case Text
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2002
Authors

This paper describes how a machine-learning named entity recognizer (NER) on upper case text can be improved by using a mixed case NER and some unlabeled text. The mixed case NER can be used to tag some unlabeled mixed case text, which is then used as additional training material for the upper case NER. We show that this approach substantially reduces the performance gap between the mixed case NER and the upper case NER, by 39% for MUC-6 and 22% for MUC-7 named entity test data. Our method is thus useful in improving the accuracy of NERs on upper case text, such as transcribed text from automatic speech recognizers, where case information is missing.
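The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy tagger, and the label set are all hypothetical, and the sketch assumes that the automatically tagged text is converted to upper case before it is given to the upper case NER. The last helper shows one reading of "reducing the performance gap", namely the fraction of the original gap that is closed.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]
Labels = List[str]
Tagger = Callable[[Sequence[str]], Labels]


def build_upper_case_training_data(
    mixed_case_tagger: Tagger,
    unlabeled_sentences: Sequence[Sequence[str]],
) -> List[Tuple[Tokens, Labels]]:
    """Tag mixed case sentences, then pair the labels with upper-cased tokens."""
    examples: List[Tuple[Tokens, Labels]] = []
    for tokens in unlabeled_sentences:
        # Labels are predicted while case information is still available.
        labels = mixed_case_tagger(tokens)
        # Assumption: case information is removed by simple upper-casing.
        upper_tokens = [tok.upper() for tok in tokens]
        examples.append((upper_tokens, labels))
    return examples


def toy_mixed_case_tagger(tokens: Sequence[str]) -> Labels:
    """Hypothetical stand-in for a trained mixed case NER.

    Marks every capitalized token except the sentence-initial one as an entity.
    """
    return [
        "ENTITY" if i > 0 and tok[:1].isupper() else "O"
        for i, tok in enumerate(tokens)
    ]


def relative_gap_reduction(gap_before: float, gap_after: float) -> float:
    """Fraction of the mixed case vs. upper case performance gap that is closed."""
    return (gap_before - gap_after) / gap_before


if __name__ == "__main__":
    sentences = [["Shares", "of", "Acme", "rose", "in", "London"]]
    data = build_upper_case_training_data(toy_mixed_case_tagger, sentences)
    for tokens, labels in data:
        print(list(zip(tokens, labels)))
```

In a real setting the toy tagger would be replaced by the trained mixed case NER, and the resulting pairs would be added to whatever labeled upper case data the upper case NER is already trained on.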