Paper: Updating a Name Tagger Using Contemporary Unlabeled Data

ACL ID P09-2089
Title Updating a Name Tagger Using Contemporary Unlabeled Data
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2009
Authors
  • Cristina Mota (INESC, Lisbon Portugal; New York University, New York NY; Technical University of Lisbon, Lisbon Portugal)
  • Ralph Grishman (New York University, New York NY)

For many NLP tasks, including named entity tagging, semi-supervised learning has been proposed as a reasonable alternative to methods that require annotating large amounts of training data. In this paper, we address the problem of analyzing new data given a semi-supervised NE tagger trained on data from an earlier time period. We will show that updating the unlabeled data is sufficient to maintain quality over time, and outperforms updating the labeled data. Furthermore, we will also show that augmenting the unlabeled data with older data in most cases does not result in better performance than simply using a smaller amount of current unlabeled data.