Paper: Discovery of Term Variation in Japanese Web Search Queries

ACL ID D09-1154
Title Discovery of Term Variation in Japanese Web Search Queries
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009
Authors

In this paper we address the problem of identi- fying a broad range of term variations in Japa- nese web search queries, where these varia- tions pose a particularly thorny problem due to the multiple character types employed in its writing system. Our method extends the tech- niques proposed for English spelling correc- tion of web queries to handle a wider range of term variants including spelling mistakes, va- lid alternative spellings using multiple charac- ter types, transliterations and abbreviations. The core of our method is a statistical model built on the MART algorithm (Friedman, 2001). We show that both string and semantic similarity features contribute to identifying term variation in web search queries; specifi- cally, the semantic similarity features used in our sy...