Paper: Toward A Task-Based Gold Standard For Evaluation Of NP Chunks And Technical Terms

ACL ID N03-1035
Title Toward A Task-Based Gold Standard For Evaluation Of NP Chunks And Technical Terms
Venue Human Language Technologies
Session Main Conference
Year 2003
Authors

We propose a gold standard for evaluating two types of information extraction output -- noun phrase (NP) chunks (Abney 1991; Ramshaw and Marcus 1995) and technical terms (Justeson and Katz 1995; Daille 2000; Jacquemin 2002). The gold standard is built around the notion that since different semantic and syntactic variants of terms are arguably correct, a fully satisfactory assessment of the quality of the output must include task-based evaluation. We conducted an experiment that assessed subjects’ choice of index terms in an information access task. Subjects showed significant preference for index terms that are longer, as measured by number of words, and more complex, as measured by number of prepositions. These terms, which were identified by a human indexer, serve as the gold sta...