Paper: Fully Parsing The Penn Treebank

ACL ID N06-1024
Title Fully Parsing The Penn Treebank
Venue Human Language Technologies
Session Main Conference
Year 2006

We present a two-stage parser that recovers Penn Treebank-style syntactic analyses of new sentences, including skeletal syntactic structure and, for the first time, both function tags and empty categories. The accuracy of the first-stage parser on the standard Parseval metric matches that of the Collins (2003) parser on which it is based, despite the data fragmentation caused by the greatly enriched space of possible node labels. This first stage simultaneously achieves near state-of-the-art performance on recovering function tags with minimal modifications to the underlying parser, modifying fewer than ten lines of code. The second stage achieves state-of-the-art performance on the recovery of empty categories by combining a linguistically-informed architecture and a rich featu...