Paper: A Syntax-Based Part-Of-Speech Analyser

ACL ID E95-1022
Title A Syntax-Based Part-Of-Speech Analyser
Venue Annual Meeting of The European Chapter of The Association for Computational Linguistics
Session Main Conference
Year 1995

There are two main methodologies for constructing the knowledge base of a natural language analyser: the linguistic and the data-driven. Recent state-of-the-art part-of-speech taggers are based on the data-driven approach. Because of the known feasibility of the linguistic rule-based approach at related levels of description, the success of the data-driven approach in part-of-speech analysis may appear surprising. In this paper, a case is made for the syntactic nature of part-of-speech tagging. A new tagger of English that uses only linguistic distributional rules is outlined and empirically evaluated. Tested against a benchmark corpus of 38,000 words of previously unseen text, this syntax-based system reaches an accuracy of above 99%. Compared to the 95-97% accuracy of i...