ACL ID N07-1012
Title Information Retrieval On Empty Fields
Venue Human Language Technologies
Year 2007

We explore the problem of retrieving semi-structured documents from a real-world collection using a structured query. We formally develop Structured Relevance Models (SRM), a retrieval model that is based on the idea that plausible values for a given field could be inferred from the context provided by the other fields in the record. We then carry out a set of experiments using a snapshot of the National Science Digital Library (NSDL) repository, and queries that only mention fields missing from the test data. For such queries, typical field matching would retrieve no documents at all. In contrast, the SRM approach achieves a mean average precision of over twenty percent.
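
For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how mean average precision is conventionally computed, assuming binary relevance judgments. The function names and the toy document IDs are illustrative assumptions and are not taken from the paper; the sketch shows the standard metric only, not the SRM model or the paper's evaluation setup.

```python
from typing import Hashable, Sequence, Set, Tuple


def average_precision(ranking: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list against a set of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision over all (ranking, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data (hypothetical): two queries with their rankings and relevant sets.
toy_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
toy_map = mean_average_precision(toy_runs)
```

Under this definition, a system that retrieves no documents for any query, as plain field matching would for queries on empty fields, scores zero, because every ranking is empty.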