Paper: Creating A Test Collection For Citation-Based IR Experiments

ACL ID N06-1050
Title Creating A Test Collection For Citation-Based IR Experiments
Venue Human Language Technologies
Session Main Conference
Year 2006
Authors

We present an approach to building a test collection of research papers. The approach is based on the Cranfield 2 tests but uses as its vehicle a current conference; research questions and relevance judgements of all cited papers are elicited from conference authors. The resultant test collection is different from TREC’s in that it comprises scientific articles rather than newspaper text and, thus, allows for IR experiments that include citation information. The test collection currently consists of 170 queries with relevance judgements; the document collection is the ACL Anthology. We describe properties of our queries and relevance judgements, and demonstrate the use of the test collection in an experimental setup. One potentially problematic property of our collection is tha...