Paper: Building Domain-Specific Taggers without Annotated (Domain) Data

ACL ID D07-1118
Title Building Domain-Specific Taggers without Annotated (Domain) Data
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007

Part-of-speech tagging is a fundamental component in many NLP systems. When taggers developed in one domain are used in another domain, the performance can degrade considerably. We present a method for developing taggers for new domains without requiring POS-annotated text in the new domain. Our method involves using raw domain text and identifying related words to form a domain-specific lexicon. This lexicon provides the initial lexical probabilities for EM training of an HMM model. We evaluate the method by applying it in the Biology domain and show that we achieve results that are comparable with some taggers developed for this domain.
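
The abstract's central step, a lexicon supplying the initial lexical (emission) probabilities for EM training of an HMM, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the domain lexicon is already given as a mapping from words to permitted tags (the paper's procedure for building it from related words is not reproduced here), and the uniform spreading of probability mass, the smoothing constant `eps`, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

def init_emissions(lexicon, tags, vocab, eps=1e-3):
    """Initial P(word | tag): near-uniform over words the lexicon allows for
    the tag. Words missing from the lexicon are assumed to allow every tag."""
    emit = {}
    for t in tags:
        weights = {w: (1.0 if t in lexicon.get(w, tags) else eps) for w in vocab}
        total = sum(weights.values())
        emit[t] = {w: v / total for w, v in weights.items()}
    return emit

def forward_backward(sent, tags, start, trans, emit):
    """Scaled forward and backward passes for one untagged sentence."""
    n = len(sent)
    alpha = [dict() for _ in range(n)]
    beta = [dict() for _ in range(n)]
    scale = [0.0] * n
    for i, w in enumerate(sent):
        for t in tags:
            prev = start[t] if i == 0 else sum(
                alpha[i - 1][s] * trans[s][t] for s in tags)
            alpha[i][t] = prev * emit[t][w]
        scale[i] = sum(alpha[i].values())
        for t in tags:
            alpha[i][t] /= scale[i]
    for t in tags:
        beta[n - 1][t] = 1.0
    for i in range(n - 2, -1, -1):
        for s in tags:
            beta[i][s] = sum(trans[s][t] * emit[t][sent[i + 1]] * beta[i + 1][t]
                             for t in tags) / scale[i + 1]
    return alpha, beta, scale

def normalize(counts, keys):
    """Turn expected counts into a distribution (uniform if no mass)."""
    total = sum(counts[k] for k in keys)
    return {k: (counts[k] / total if total > 0 else 1.0 / len(keys)) for k in keys}

def em_step(sentences, tags, vocab, start, trans, emit):
    """One Baum-Welch iteration over raw (unannotated) sentences."""
    c_start = defaultdict(float)
    c_trans = {s: defaultdict(float) for s in tags}
    c_emit = {t: defaultdict(float) for t in tags}
    for sent in sentences:
        alpha, beta, scale = forward_backward(sent, tags, start, trans, emit)
        for i, w in enumerate(sent):
            for t in tags:
                gamma = alpha[i][t] * beta[i][t]  # posterior of tag t at i
                c_emit[t][w] += gamma
                if i == 0:
                    c_start[t] += gamma
            if i + 1 < len(sent):
                nxt = sent[i + 1]
                for s in tags:
                    for t in tags:
                        c_trans[s][t] += (alpha[i][s] * trans[s][t] *
                                          emit[t][nxt] * beta[i + 1][t] / scale[i + 1])
    new_start = normalize(c_start, tags)
    new_trans = {s: normalize(c_trans[s], tags) for s in tags}
    new_emit = {t: normalize(c_emit[t], vocab) for t in tags}
    return new_start, new_trans, new_emit

def posterior_tags(sent, tags, start, trans, emit):
    """Pick the most probable tag at each position (posterior decoding)."""
    alpha, beta, _ = forward_backward(sent, tags, start, trans, emit)
    return [max(tags, key=lambda t: alpha[i][t] * beta[i][t])
            for i in range(len(sent))]

# Toy usage with a hypothetical two-tag set and a tiny raw corpus.
tags = ["N", "V"]
lexicon = {"protein": {"N"}, "binds": {"V"}}
sentences = [["protein", "binds", "protein"], ["kinase", "binds", "protein"]]
vocab = sorted({w for s in sentences for w in s})
start = {t: 1.0 / len(tags) for t in tags}
trans = {s: {t: 1.0 / len(tags) for t in tags} for s in tags}
emit = init_emissions(lexicon, tags, vocab)
for _ in range(5):
    start, trans, emit = em_step(sentences, tags, vocab, start, trans, emit)
```

In this sketch only the emission table is informed by the lexicon. Start and transition probabilities begin uniform, so any tag preference for an out-of-lexicon word such as "kinase" has to come from the contexts EM observes in the raw text.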