Paper: Paraphrasing With Bilingual Parallel Corpora

ACL ID P05-1074
Title Paraphrasing With Bilingual Parallel Corpora
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2005

Previous work has used monolingual parallel corpora to extract and generate paraphrases. We show that this task can be done using bilingual parallel corpora, a much more commonly available resource. Using alignment techniques from phrase-based statistical machine translation, we show how paraphrases in one language can be identified using a phrase in another language as a pivot. We define a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and show how it can be refined to take contextual information into account. We evaluate our paraphrase extraction and ranking methods using a set of manual word alignments, and contrast the quality with paraphrases extracted from automatic alignments.
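
The pivot idea in the abstract can be illustrated with a minimal sketch. It assumes the paraphrase probability is obtained by summing, over every foreign pivot phrase f, the product of the two translation probabilities p(f | e1) and p(e2 | f). The function name, the table layout, and all probability values below are hypothetical and chosen only for illustration; they are not taken from the paper's data, and the contextual refinement mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def rank_paraphrases(e1, p_f_given_e, p_e_given_f):
    """Rank candidate paraphrases of the phrase e1 by pivoting through
    foreign phrases.

    p_f_given_e: dict mapping an English phrase to {foreign phrase: p(f | e)}
    p_e_given_f: dict mapping a foreign phrase to {English phrase: p(e | f)}
    Both tables are assumed inputs, e.g. estimated from phrase alignments.
    """
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(e1, {}).items():
        for e2, p_e2 in p_e_given_f.get(f, {}).items():
            if e2 == e1:
                continue  # a phrase is not counted as its own paraphrase
            # Sum over all pivots that connect e1 to e2.
            scores[e2] += p_f * p_e2
    # Highest paraphrase probability first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy translation tables with made-up probabilities.
p_f_given_e = {
    "under control": {"unter kontrolle": 0.75, "im griff": 0.25},
}
p_e_given_f = {
    "unter kontrolle": {"under control": 0.5, "in check": 0.5},
    "im griff": {"under control": 0.5, "in hand": 0.5},
}

ranked = rank_paraphrases("under control", p_f_given_e, p_e_given_f)
print(ranked)
```

With these toy tables, "in check" ranks above "in hand" because it is reached through the more probable pivot. Nested dictionaries are used here only to keep the example short; a real phrase table would be far larger and would typically be stored in an indexed structure.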