Paper: Enlarging Paraphrase Collections through Generalization and Instantiation

ACL ID D12-1058
Title Enlarging Paraphrase Collections through Generalization and Instantiation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

This paper presents a paraphrase acquisition method that uncovers and exploits generalities underlying paraphrases: paraphrase patterns are first induced and then used to collect novel instances. Unlike existing methods, ours uses both bilingual parallel and monolingual corpora. While the former are regarded as a source of high-quality seed paraphrases, the latter are searched for paraphrases that match patterns learned from the seed paraphrases. We show how one can use monolingual corpora, which are far more numerous and larger than bilingual corpora, to obtain paraphrases that rival in quality those derived directly from bilingual corpora. In our experiments, the number of paraphrase pairs obtained in this way from monolingual corpora was a large multiple of the number of seed ...
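
As an illustration only, the generalize-then-instantiate idea described in the abstract can be sketched in a few lines of Python. The sketch assumes a deliberately simple notion of pattern (the single word shared by both sides of a seed pair becomes a slot `X`) and exact n-gram matching over a toy corpus. The function names `generalize`, `fillers`, and `instantiate`, the seed pair, and the sentences are all hypothetical and are not taken from the paper, whose actual pattern induction and filtering may differ.

```python
# Toy sketch of "generalization and instantiation" (assumed, simplified).

def generalize(lhs, rhs):
    """Turn a seed pair into a pattern pair by replacing the single
    word shared by both sides with the slot 'X'. Returns None when
    the pair does not share exactly one word."""
    l, r = lhs.split(), rhs.split()
    shared = [w for w in l if w in r]
    if len(shared) != 1 or r.count(shared[0]) != 1:
        return None
    w = shared[0]
    if len(l) == 1 or len(r) == 1:
        return None  # a side that is only the slot carries no pattern
    return (tuple("X" if t == w else t for t in l),
            tuple("X" if t == w else t for t in r))

def fillers(pattern, sentences):
    """Collect every word that fills the slot 'X' when the pattern
    matches a contiguous token window in the monolingual sentences."""
    n, slot = len(pattern), pattern.index("X")
    found = set()
    for sent in sentences:
        toks = sent.split()
        for i in range(len(toks) - n + 1):
            window = toks[i:i + n]
            if all(p == "X" or p == t for p, t in zip(pattern, window)):
                found.add(window[slot])
    return found

def instantiate(pattern, sentences):
    """Propose new paraphrase pairs: slot fillers observed with BOTH
    sides of the pattern in the monolingual data."""
    lp, rp = pattern
    common = fillers(lp, sentences) & fillers(rp, sentences)
    return {(" ".join(w if t == "X" else t for t in lp),
             " ".join(w if t == "X" else t for t in rp))
            for w in common}

# Hypothetical seed pair (standing in for one obtained from bilingual data).
pattern = generalize("amendment of regulation", "amending regulation")

# Hypothetical monolingual corpus.
corpus = [
    "the amendment of treaties takes time",
    "amending treaties is slow",
    "amendment of statutes was proposed",
    "they discussed amending regulation today",
]

new_pairs = instantiate(pattern, corpus)
print(pattern)
print(new_pairs)
```

Requiring a filler to occur with both sides of the pattern is one simple way to keep spurious instances out; a realistic system would presumably need stronger filtering, for example a similarity score over the contexts of the two candidate phrases.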