Paper: Predicting Part-of-Speech Information about Unknown Words using Statistical Methods

ACL ID P98-2251
Title Predicting Part-of-Speech Information about Unknown Words using Statistical Methods
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998
Authors

This paper examines the feasibility of using statistical methods to train a part-of-speech predictor for unknown words. By using statistical methods, without incorporating hand-crafted linguistic information, the predictor could be used with any language for which there is a large tagged training corpus. Encouraging results have been obtained by testing the predictor on unknown words from the Brown corpus. The relative value of information sources such as affixes and context is discussed. This part-of-speech predictor will be used in a part-of-speech tagger to handle out-of-lexicon words.
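
As a rough illustration of how affix information alone might drive such a predictor, the sketch below counts which tags co-occur with each word-final suffix in a tagged corpus and, for an unknown word, returns the relative tag frequencies of the longest suffix seen in training. This is not the paper's method: the function names, the `max_suffix_len` parameter, the longest-suffix backoff, and the toy corpus are all assumptions made for this example, and context is not modeled at all.

```python
from collections import Counter, defaultdict


def train_suffix_model(tagged_words, max_suffix_len=3):
    """Count the tags seen with each word-final suffix of length 1..max_suffix_len."""
    suffix_tag_counts = defaultdict(Counter)
    for word, tag in tagged_words:
        lowered = word.lower()
        for n in range(1, min(max_suffix_len, len(lowered)) + 1):
            suffix_tag_counts[lowered[-n:]][tag] += 1
    return suffix_tag_counts


def predict_tag_distribution(word, suffix_tag_counts, max_suffix_len=3):
    """Return relative tag frequencies for the longest suffix seen in training.

    Returns an empty dict when no suffix of the word was observed.
    """
    lowered = word.lower()
    for n in range(min(max_suffix_len, len(lowered)), 0, -1):
        counts = suffix_tag_counts.get(lowered[-n:])
        if counts:
            total = sum(counts.values())
            return {tag: count / total for tag, count in counts.items()}
    return {}


# Hypothetical toy data standing in for a large tagged training corpus.
toy_corpus = [
    ("walking", "VBG"),
    ("running", "VBG"),
    ("building", "NN"),
    ("quickly", "RB"),
    ("slowly", "RB"),
]
model = train_suffix_model(toy_corpus)

# "jumping" is absent from the toy corpus, so it is predicted from its suffix.
distribution = predict_tag_distribution("jumping", model)
```

Returning a distribution over tags rather than a single best tag fits the abstract's stated goal of using the predictor inside a tagger, which could combine these probabilities with contextual evidence.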