Paper: Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

ACL ID D11-1118
Title Unsupervised Dependency Parsing without Gold Part-of-Speech Tags
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011
Authors

We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags: requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different c...
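
The ablation described in the abstract, in which every word is forced to carry a single part-of-speech regardless of context, can be illustrated with a minimal sketch. It assumes the single tag is chosen as the word's most frequent gold tag in the corpus; the abstract does not state how the tag is selected, and the function names and toy corpus below are hypothetical.

```python
from collections import Counter, defaultdict

def most_frequent_tag_map(tagged_sentences):
    """Map each word to the tag it most often carries in the corpus.

    Assumption: "one tag per word" is realized by taking the most
    frequent tag; this selection rule is illustrative only.
    """
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def collapse_to_one_tag_per_word(tagged_sentences):
    """Retag the corpus so that every word type has exactly one tag."""
    tag_of = most_frequent_tag_map(tagged_sentences)
    return [[(word, tag_of[word]) for word, _ in sentence]
            for sentence in tagged_sentences]

# Hypothetical toy corpus: "run" is a verb twice and a noun once.
corpus = [
    [("they", "PRP"), ("run", "VBP")],
    [("a", "DT"), ("run", "NN")],
    [("we", "PRP"), ("run", "VBP")],
]
collapsed = collapse_to_one_tag_per_word(corpus)
```

After collapsing, the noun use of "run" in the second sentence is retagged as a verb, which is the kind of lost context-dependence the ablation measures.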