Paper: Part-Of-Speech Induction From Scratch

ACL ID P93-1034
Title Part-Of-Speech Induction From Scratch
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1993

This paper presents a method for inducing the parts of speech of a language and part-of-speech labels for individual words from a large text corpus. Vector representations for the part-of-speech of a word are formed from entries of its near lexical neighbors. A dimensionality reduction creates a space representing the syntactic categories of unambiguous words. A neural net trained on these spatial representations classifies individual contexts of occurrence of ambiguous words. The method classifies both ambiguous and unambiguous words correctly with high accuracy.
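
The first two steps of the abstract (neighbor-based vectors, then dimensionality reduction) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the one-word left and right window, the use of raw counts, truncated SVD as the reduction, the toy corpus, and the function names `context_vectors` and `reduce_dimensions`. The neural-net classification of ambiguous words is not shown.

```python
import numpy as np


def context_vectors(tokens):
    """Count immediate left and right neighbors for every word type.

    Assumption: a one-word window on each side; the abstract only says
    "near lexical neighbors".
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    # Columns 0..n-1 hold left-neighbor counts, columns n..2n-1 right-neighbor counts.
    counts = np.zeros((n, 2 * n))
    for pos, word in enumerate(tokens):
        row = index[word]
        if pos > 0:
            counts[row, index[tokens[pos - 1]]] += 1
        if pos + 1 < len(tokens):
            counts[row, n + index[tokens[pos + 1]]] += 1
    return vocab, counts


def reduce_dimensions(counts, k):
    """Project the count vectors onto the top-k singular directions.

    Assumption: truncated SVD stands in for the unspecified
    "dimensionality reduction".
    """
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy corpus: "cat" and "dog" occur in identical contexts.
tokens = "the cat sleeps . the dog sleeps . a cat runs . a dog runs .".split()
vocab, counts = context_vectors(tokens)
reduced = reduce_dimensions(counts, k=3)

cat, dog = vocab.index("cat"), vocab.index("dog")
print(cosine(counts[cat], counts[dog]))
```

In this toy corpus the two nouns share exactly the same neighbors, so their vectors coincide both before and after the reduction. On a real corpus, words of the same category would be expected to lie close together rather than coincide.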