Paper: Query Translation By Text Categorization

ACL ID C04-1099
Title Query Translation By Text Categorization
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004
Authors
  • Patrick Ruch (University of Geneva, Geneva, Switzerland; Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland)

We report on the development of a cross-language information retrieval system, which translates user queries by categorizing these queries into terms listed in a controlled vocabulary. Unlike usual automatic text categorization systems, which rely on data-intensive models induced from large training data, our automatic text categorization tool applies data-independent classifiers: a vector-space engine and a pattern matcher are combined to improve ranking of Medical Subject Headings (MeSH). The categorizer also benefits from the availability of large thesauri, where variants of MeSH terms can be found. For evaluation, we use an English collection of MedLine records: OHSUMED. French OHSUMED queries, translated from the original English queries by domain experts, are mapped int...
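
The idea of combining a vector-space engine with a pattern matcher to rank controlled-vocabulary terms can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy thesaurus entries, the function names, the cosine weighting, and the linear combination with weight `alpha` are hypothetical choices, not the author's actual scoring scheme.

```python
import math
import re
from collections import Counter

# Hypothetical toy thesaurus: English MeSH-style heading -> French variants.
# These entries are illustrative only; they are not data from the paper.
THESAURUS = {
    "Hypertension": ["hypertension", "hypertension artérielle"],
    "Diabetes Mellitus": ["diabète", "diabète sucré"],
    "Myocardial Infarction": ["infarctus du myocarde", "crise cardiaque"],
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def vector_score(query, variants):
    """Cosine similarity between the query and the pooled variant tokens."""
    q = Counter(tokenize(query))
    d = Counter(tok for v in variants for tok in tokenize(v))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(c * c for c in q.values())) * math.sqrt(
        sum(c * c for c in d.values())
    )
    return dot / norm if norm else 0.0


def pattern_score(query, variants):
    """Return 1.0 if any variant occurs as a whole phrase in the query, else 0.0."""
    normalized = " ".join(tokenize(query))
    for v in variants:
        phrase = " ".join(tokenize(v))
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            return 1.0
    return 0.0


def rank_headings(query, thesaurus=THESAURUS, alpha=0.5):
    """Rank headings by an assumed linear mix of the two classifier scores."""
    scored = [
        (
            heading,
            alpha * vector_score(query, variants)
            + (1 - alpha) * pattern_score(query, variants),
        )
        for heading, variants in thesaurus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    french_query = "traitement de l'infarctus du myocarde chez le patient avec diabète"
    for heading, score in rank_headings(french_query):
        print(f"{heading}: {score:.3f}")
```

In this sketch a French query is mapped to English headings because the French variants are attached to each heading, so the ranked headings can serve as the translated query. Neither scorer needs training data, which mirrors the data-independent setting described in the abstract, although the real system's components are certainly more elaborate.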