Paper: Exploiting Domain Structure For Named Entity Recognition

ACL ID N06-1010
Title Exploiting Domain Structure For Named Entity Recognition
Venue Human Language Technologies
Session Main Conference
Year 2006

Named Entity Recognition (NER) is a fundamental task in text mining and nat- ural language understanding. Current ap- proaches to NER (mostly based on super- vised learning) perform well on domains similar to the training domain, but they tend to adapt poorly to slightly different domains. We present several strategies for exploiting the domain structure in the training data to learn a more robust named entity recognizer that can perform well on a new domain. First, we propose a sim- ple yet effective way to automatically rank features based on their generalizabilities across domains. We then train a classifier with strong emphasis on the most general- izable features. This emphasis is imposed by putting a rank-based prior on a logis- tic regression model. We further propose a domain-aware...