Paper: Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem

ACL ID D07-1082
Title Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007
Authors

In this paper, we analyze the effect of resampling techniques, including under-sampling and over-sampling used in active learning for word sense disambiguation (WSD). Experimental results show that under-sampling causes negative effects on active learning, but over-sampling is a relatively good choice. To alleviate the within-class imbalance problem of over-sampling, we propose a bootstrap-based over-sampling (BootOS) method that works better than ordinary over-sampling in active learning for WSD. Finally, we investigate when to stop active learning, and adopt two strategies, max-confidence and min-error, as stopping conditions for active learning. According to experimental results, we suggest a prediction solution by considering max-confidence as the upper bound and min-error...
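
As a rough illustration of the two ordinary resampling schemes the abstract contrasts, the sketch below balances a labeled set of sense-tagged examples by random over-sampling (duplicating minority-sense examples) and by random under-sampling (discarding majority-sense examples). It is generic random resampling only, not the proposed BootOS method, whose details are not given in this abstract. The function names, the seed parameter, and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def _group_by_label(examples, labels):
    """Group examples by their sense label, preserving input order."""
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_oversample(examples, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen examples of each
    smaller class until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_label(examples, labels)
    target = max(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        # Draw the missing examples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out.extend((x, y) for x in xs + extra)
    return out


def random_undersample(examples, labels, seed=0):
    """Hypothetical helper: randomly discard examples of each larger
    class until every class is as small as the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_label(examples, labels)
    target = min(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        # Keep a random subset (without replacement) of size `target`.
        out.extend((x, y) for x in rng.sample(xs, target))
    return out


# Toy data (assumed): six contexts of sense "a", two of sense "b".
contexts = ["c%d" % i for i in range(8)]
senses = ["a"] * 6 + ["b"] * 2

over = random_oversample(contexts, senses)
under = random_undersample(contexts, senses)
print(Counter(y for _, y in over))
print(Counter(y for _, y in under))
```

Over-sampling keeps every original example but repeats minority ones, whereas under-sampling throws labeled data away, which is one plausible reading of why the abstract reports it hurting active learning.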