4 Experimental results

We evaluated the baseline and the machine learning configurations of section 3 on the definition questions of TREC-9 (2000) and TREC-2001, the same data used by Prager et al. For each question, the TREC organizers provide the 50 most highly ranked documents that an IR engine returned from the TREC document collection.

In our own earlier work, we developed a specialized mechanism called Virtual Annotation for handling definition questions (e.g., Who was Galileo? and What are antibiotics?), which consults a structured knowledge source (WordNet) in addition to the standard reference corpus when answering such questions (Prager et al., 2001). This is generally a very good strategy, which has been exploited successfully in a number of automated QA systems that have appeared in recent years, especially in the context of the TREC QA evaluations (Harabagiu et al., 2000; Hovy et al., 2000; Prager et al., 2001). The definition-Q agent targets definition questions (e.g., What is penicillin? and Who is Picasso?) with a technique called Virtual Annotation, using the external knowledge source WordNet (Prager et al., 2001).

Following Prager et al., we count a snippet as containing an acceptable definition if it satisfies the Perl answer patterns that the TREC organizers provide for the corresponding question (Voorhees, 2001).
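
To make the pattern-based judging criterion above concrete, the sketch below shows one way a snippet could be checked against the TREC answer patterns, assuming the Perl patterns have been converted to Python-compatible regular expressions. The file format, question id, and example pattern are illustrative assumptions, not the actual TREC materials.

import re

def load_answer_patterns(path):
    """Load answer patterns, assuming one pattern per line prefixed by the
    question id (this file layout is an assumption for illustration)."""
    patterns = {}
    with open(path) as f:
        for line in f:
            qid, _, pattern = line.strip().partition(" ")
            patterns.setdefault(qid, []).append(re.compile(pattern))
    return patterns

def snippet_is_acceptable(snippet, qid, patterns):
    """A snippet counts as containing an acceptable definition if it matches
    at least one answer pattern for the corresponding question."""
    return any(p.search(snippet) for p in patterns.get(qid, []))

# Hypothetical usage with an inline, made-up question id and pattern;
# real patterns would be loaded from the file distributed by TREC.
patterns = {"Q1": [re.compile(r"(?i)italian\s+astronomer")]}
snippets = ["Galileo, the Italian astronomer who championed Copernicus."]
accepted = [s for s in snippets if snippet_is_acceptable(s, "Q1", patterns)]
print(accepted)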