Paper: Towards A More Careful Evaluation Of Broad Coverage Parsing Systems

ACL ID C96-1095
Title Towards A More Careful Evaluation Of Broad Coverage Parsing Systems
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996
Authors

Since treebanks have become available to researchers, a wide variety of techniques has been used to make broad coverage parsing systems. This makes quantitative evaluation very important, but the current evaluation methods have a number of drawbacks, such as arbitrary choices in the treebank and the difficulty in measuring statistical significance. We suggest a more detailed method for testing a parsing system using constituent boundaries, with a number of measures that give more information than current measures, and evaluate the quality of the test. We also show that statistical significance cannot be calculated in a straightforward way, and suggest a calculation method for the case of Bracket Recall.
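
For readers unfamiliar with the measure named above, the following is a minimal sketch of Bracket Recall in its commonly used form: the fraction of treebank constituent brackets that also appear in the parser output. It is an illustration only and may differ in detail from the definition used in the paper. The function name `bracket_recall`, the representation of brackets as `(start, end)` word-position pairs, and the example spans are all assumptions made for this sketch.

```python
from collections import Counter


def bracket_recall(gold, predicted):
    """Return the fraction of gold (treebank) brackets found in the parser output.

    Brackets are (start, end) spans over word positions. This representation
    is an assumption of the sketch; labels are ignored. Repeated spans are
    matched as a multiset, so a duplicate counts only as often as it occurs
    on both sides.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    total_gold = sum(gold_counts.values())
    if total_gold == 0:
        # No gold brackets: recall is undefined; 0.0 is returned by convention here.
        return 0.0
    matched = sum((gold_counts & pred_counts).values())
    return matched / total_gold


# Hypothetical example: one of four treebank brackets is missed by the parser.
gold = [(0, 5), (0, 2), (2, 5), (3, 5)]
predicted = [(0, 5), (0, 2), (1, 5), (3, 5)]
recall = bracket_recall(gold, predicted)
print(recall)  # 0.75
```

Brackets within a sentence are not independent of one another, which is one reason a per-bracket significance test is not straightforward. The sketch computes only the score itself and does not attempt the significance calculation.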