Paper: QuestionBank: Creating A Corpus Of Parse-Annotated Questions

ACL ID P06-1063
Title QuestionBank: Creating A Corpus Of Parse-Annotated Questions
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2006

This paper describes the development of QuestionBank, a corpus of 4000 parse-annotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and supplementary training resource for a state-of-the-art parser in parsing both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing expe...