Paper: Paraphrasing with Search Engine Query Logs

ACL ID C10-1148
Title Paraphrasing with Search Engine Query Logs
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2010
Authors

This paper proposes a method that extracts paraphrases from search engine query logs. The method first extracts paraphrase query-title pairs, based on the assumption that a search query and its corresponding clicked document titles may mean the same thing. It then extracts paraphrase query-query and title-title pairs from the query-title paraphrases with a pivot approach. Paraphrases extracted in each step are validated with a binary classifier. We evaluate the method using a query log from Baidu, a Chinese search engine. Experimental results show that the proposed method is effective: it extracts more than 3.5 million pairs of paraphrases with a precision of over 70%. The results also show that the extracted paraphrases can be used to generate high-quality paraphrase patterns.
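
The pivot step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name pivot_pairs, the input format (a list of already validated query-title pairs), and the toy data are assumptions made for illustration. Two queries that share a paraphrase title become a query-query candidate, and two titles that share a paraphrase query become a title-title candidate. The binary classifier that the abstract says validates each step is omitted here.

```python
from collections import defaultdict
from itertools import combinations


def pivot_pairs(query_title_pairs):
    """Derive candidate query-query and title-title pairs by pivoting.

    query_title_pairs: iterable of (query, title) tuples that are assumed
    to have already been accepted as paraphrases (hypothetical input format).
    Returns two sets of alphabetically ordered pairs.
    """
    titles_by_query = defaultdict(set)
    queries_by_title = defaultdict(set)
    for query, title in query_title_pairs:
        titles_by_query[query].add(title)
        queries_by_title[title].add(query)

    # Queries that share a title (the pivot) are query-query candidates.
    query_query = set()
    for queries in queries_by_title.values():
        query_query.update(combinations(sorted(queries), 2))

    # Titles that share a query (the pivot) are title-title candidates.
    title_title = set()
    for titles in titles_by_query.values():
        title_title.update(combinations(sorted(titles), 2))

    return query_query, title_title


# Toy data, invented for illustration only.
pairs = [
    ("cheap flights", "Low-cost airline tickets"),
    ("budget air tickets", "Low-cost airline tickets"),
    ("cheap flights", "Discount flight deals"),
]
qq, tt = pivot_pairs(pairs)
```

In this toy run, the two queries pivot on the shared title, and the two titles pivot on the shared query. In a full pipeline, each candidate set would still need the validation step before being counted as paraphrases.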