Paper: Adapting a Lexicalized-Grammar Parser to Contrasting Domains

ACL ID D08-1050
Title Adapting a Lexicalized-Grammar Parser to Contrasting Domains
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

Most state-of-the-art wide-coverage parsers are trained on newspaper text and suffer a loss of accuracy in other domains, making parser adaptation a pressing issue. In this paper we demonstrate that a CCG parser can be adapted to two new domains, biomedical text and questions for a QA system, by using manually annotated training data at the POS and lexical category levels only. This approach achieves parser accuracy comparable to that on newspaper data without the need for annotated parse trees in the new domain. We find that retraining at the lexical category level yields a larger performance increase for questions than for biomedical text, and we analyze the two datasets to investigate why different domains might behave differently for parser adaptation.