Paper: Predicting Part-of-Speech Information about Unknown Words using Statistical Methods

ACL ID C98-2246
Title Predicting Part-of-Speech Information about Unknown Words using Statistical Methods
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors

This paper examines the feasibility of using statistical methods to train a part-of-speech predictor for unknown words. By using statistical methods, without incorporating hand-crafted linguistic information, the predictor could be used with any language for which there is a large tagged training corpus. Encouraging results have been obtained by testing the predictor on unknown words from the Brown corpus. The relative value of information sources such as affixes and context is discussed. This part-of-speech predictor will be used in a part-of-speech tagger to handle out-of-lexicon words.
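
The abstract does not specify the model, so the following is only a minimal sketch of one way affix information could be turned into a part-of-speech prediction: tag frequencies are counted for word-final suffixes in a tagged corpus, and an unknown word receives the tag distribution of its longest suffix seen in training. The function names, the `max_suffix_len` parameter, and the back-off scheme are assumptions made for illustration, not the authors' method, and context is not modelled here.

```python
from collections import Counter, defaultdict


def train_suffix_model(tagged_words, max_suffix_len=3):
    """Count tag frequencies for each word-final suffix up to max_suffix_len.

    tagged_words is an iterable of (word, tag) pairs from a tagged corpus.
    The empty suffix "" stores the overall tag counts, used as a last resort.
    """
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        w = word.lower()
        for n in range(1, min(max_suffix_len, len(w)) + 1):
            counts[w[-n:]][tag] += 1
        counts[""][tag] += 1
    return counts


def predict_tag_distribution(word, counts, max_suffix_len=3):
    """Return relative tag frequencies for the longest suffix of `word`
    that was seen in training, backing off to shorter suffixes and
    finally to the overall tag counts."""
    w = word.lower()
    for n in range(min(max_suffix_len, len(w)), -1, -1):
        suffix = w[-n:] if n > 0 else ""
        if suffix in counts:
            total = sum(counts[suffix].values())
            return {tag: c / total for tag, c in counts[suffix].items()}
    return {}  # only reached when the model was trained on no data


if __name__ == "__main__":
    # Toy training data, invented for illustration (not taken from the paper).
    toy_corpus = [
        ("running", "VBG"), ("jumping", "VBG"), ("king", "NN"),
        ("quickly", "RB"), ("slowly", "RB"), ("happiness", "NN"),
    ]
    model = train_suffix_model(toy_corpus)
    print(predict_tag_distribution("walking", model))
    print(predict_tag_distribution("badly", model))
```

With real data, the (word, tag) pairs would come from a large tagged corpus such as the Brown corpus, with the test words held out of training so that they are genuinely unknown to the model.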