Paper: Re-examining Machine Translation Metrics for Paraphrase Identification

ACL ID N12-1019
Title Re-examining Machine Translation Metrics for Paraphrase Identification
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

We propose to re-examine the hypothesis that automated metrics developed for MT evaluation can prove useful for paraphrase identification in light of the significant work on the development of new MT metrics over the last 4 years. We show that a meta-classifier trained using nothing but recent MT metrics outperforms all previous paraphrase identification approaches on the Microsoft Research Paraphrase corpus. In addition, we apply our system to a second corpus developed for the task of plagiarism detection and obtain extremely positive results. Finally, we conduct extensive error analysis and uncover the top systematic sources of error for a paraphrase identification approach relying solely on MT metrics. We release both the new dataset and the error analysis annotations for use by...
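
To make the core idea concrete, the sketch below scores a sentence pair with MT-style overlap measures and feeds those scores to a classifier as features. It is an illustration under stated assumptions, not the system described in the abstract. The clipped n-gram precisions are toy stand-ins for real MT metrics, a single logistic regression stands in for the meta-classifier, and all function names and training pairs are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of candidate against reference (BLEU-style)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def metric_features(s1, s2):
    """Toy stand-ins for MT metric scores, computed in both directions
    because a paraphrase pair has no fixed 'reference' side (an assumption
    of this sketch)."""
    return [
        ngram_precision(s1, s2, 1),
        ngram_precision(s2, s1, 1),
        ngram_precision(s1, s2, 2),
        ngram_precision(s2, s1, 2),
    ]


def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            z = bias + sum(w * f for w, f in zip(weights, features))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - label
            weights = [w - lr * err * f for w, f in zip(weights, features)]
            bias -= lr * err
    return weights, bias


def predict_proba(model, features):
    """Probability that the pair described by `features` is a paraphrase."""
    weights, bias = model
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical training pairs: (sentence 1, sentence 2, is_paraphrase).
TOY_PAIRS = [
    ("the cat sat on the mat", "the cat sat on a mat", 1),
    ("he bought a new car", "he purchased a new car", 1),
    ("the cat sat on the mat", "stocks fell sharply today", 0),
    ("he bought a new car", "rain is expected tomorrow", 0),
]

if __name__ == "__main__":
    X = [metric_features(a, b) for a, b, _ in TOY_PAIRS]
    y = [label for _, _, label in TOY_PAIRS]
    model = train_logistic(X, y)
    for a, b, label in TOY_PAIRS:
        print(label, round(predict_proba(model, metric_features(a, b)), 3))
```

A real experiment would replace `metric_features` with scores from actual MT metric implementations and evaluate on held-out pairs. Surface overlap alone cannot separate paraphrases from near-identical sentences that differ in meaning.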