Paper: A Probabilistic Rasch Analysis Of Question Answering Evaluations

ACL ID N04-1009
Title A Probabilistic Rasch Analysis Of Question Answering Evaluations
Venue Human Language Technologies
Session Main Conference
Year 2004
Authors

The field of psychometrics routinely grapples with the question of what it means to measure the inherent ability of an organism to perform a given task, and for the last forty years, the field has increasingly relied on probabilistic methods such as the Rasch model for test construction and the analysis of test results. Because the underlying issues of measuring ability apply to human language technologies as well, such probabilistic methods can be advantageously applied to the evaluation of those technologies. To test this claim, Rasch measurement was applied to the results of 67 systems participating in the Question Answering track of the 2002 Text REtrieval Conference (TREC) competition. Satisfactory model fit was obtained, and the paper illustrates the theoretical and practic...