Paper: High-Precision Identification Of Discourse New And Unique Noun Phrases

ACL ID P03-2012
Title High-Precision Identification Of Discourse New And Unique Noun Phrases
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2003
Authors

Coreference resolution systems usually attempt to find a suitable antecedent for (almost) every noun phrase. Recent studies, however, show that many definite NPs are not anaphoric. The same claim, obviously, holds for indefinites as well. In this study we try to learn automatically two classifications, ±discourse_new and ±unique, relevant for this problem. We use a small training corpus (MUC-7), but also acquire some data from the Internet. Combining our classifiers sequentially, we achieve 88.9% precision and 84.6% recall for discourse new entities. We expect our classifiers to provide a good prefiltering for coreference resolution systems, improving both their speed and performance.
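To make the prefiltering idea and the reported figures concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the two classifiers are combined, so the sketch covers only the filtering step and the evaluation metrics. The names `prefilter`, `precision_recall`, and `f1`, and the toy classifier passed in the usage lines, are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def precision_recall(predicted: Sequence[bool], gold: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall for the positive class (here: discourse new)."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def prefilter(
    noun_phrases: List[str],
    is_discourse_new: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split NPs into (skipped, to_resolve).

    NPs predicted discourse new are skipped; only the remaining NPs
    would be handed to a coreference resolver for antecedent search.
    """
    skipped: List[str] = []
    to_resolve: List[str] = []
    for np in noun_phrases:
        (skipped if is_discourse_new(np) else to_resolve).append(np)
    return skipped, to_resolve


# Toy usage with a stand-in classifier (an assumption, not a trained model):
# every NP except the pronoun is treated as discourse new.
skipped, to_resolve = prefilter(["a man", "he", "the sun"], lambda np: np != "he")

# Harmonic mean of the precision and recall quoted in the abstract.
f_score = f1(0.889, 0.846)
```

High precision matters for a filter like this because every NP wrongly labelled discourse new is never offered to the resolver, so its antecedent cannot be recovered later. For the precision and recall quoted above, the harmonic mean works out to roughly 86.7%.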