Paper: Observational Initialization of Type-Supervised Taggers

ACL ID P14-2132
Title Observational Initialization of Type-Supervised Taggers
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

Recent work has sparked new interest in type-supervised part-of-speech tagging, a data setting in which no labeled sentences are available, but the set of allowed tags is known for each word type. This paper describes observational initialization, a novel technique for initializing EM when training a type-supervised HMM tagger. Our initializer allocates probability mass to unambiguous transitions in an unlabeled corpus, generating token-level observations from type-level supervision. Experimentally, observational initialization gives state-of-the-art type-supervised tagging accuracy, providing an error reduction of 56% over uniform initialization on the Penn English Treebank.
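The idea of allocating probability mass to unambiguous transitions can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the paper's actual procedure: it counts adjacent token pairs in which both words have exactly one allowed tag, then normalizes the smoothed counts into an initial transition table for EM. The function name `observational_transition_init`, the additive `smoothing` parameter, and the toy tag dictionary and corpus are all assumptions made for illustration.

```python
def observational_transition_init(corpus, tag_dict, tags, smoothing=1.0):
    """Build an initial HMM transition table from unambiguous tag bigrams.

    Hypothetical helper, not taken from the paper.
    corpus:    list of sentences, each a list of word strings (unlabeled).
    tag_dict:  maps a word type to its set of allowed tags (type supervision).
    tags:      list of all tags; must cover every tag used in tag_dict.
    smoothing: additive constant (> 0) so unseen transitions keep some mass.
    """
    # Start every transition at the smoothing value (assumed, not from the paper).
    counts = {prev: {nxt: smoothing for nxt in tags} for prev in tags}

    for sentence in corpus:
        for left, right in zip(sentence, sentence[1:]):
            left_tags = tag_dict.get(left, ())
            right_tags = tag_dict.get(right, ())
            # Only a pair of unambiguous words yields a token-level observation.
            if len(left_tags) == 1 and len(right_tags) == 1:
                (prev_tag,) = left_tags
                (next_tag,) = right_tags
                counts[prev_tag][next_tag] += 1.0

    # Normalize each row into a conditional distribution P(next | prev).
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: c / total for nxt, c in row.items()}
    return probs


# Toy data (invented for this example).
tags = ["DET", "NOUN", "VERB"]
tag_dict = {
    "the": {"DET"},
    "dog": {"NOUN"},
    "cat": {"NOUN"},
    "barks": {"NOUN", "VERB"},  # ambiguous, so its transitions are skipped
}
corpus = [["the", "dog", "barks"], ["the", "cat", "barks"]]

init = observational_transition_init(corpus, tag_dict, tags)
print(init["DET"])
```

In this toy corpus only the DET to NOUN transition is observed unambiguously, so that row shifts away from uniform while the NOUN and VERB rows stay uniform. A uniform initializer would correspond to skipping the counting loop entirely.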