Paper: An Empirical Evaluation of Data-Driven Paraphrase Generation Techniques

ACL ID P11-2096
Title An Empirical Evaluation of Data-Driven Paraphrase Generation Techniques
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

Paraphrase generation is an important task that has received a great deal of interest recently. Proposed data-driven solutions to the problem have ranged from simple approaches that make minimal use of NLP tools to more complex approaches that rely on numerous language-dependent resources. Despite all of the attention, there have been very few direct empirical evaluations comparing the merits of the different approaches. This paper empirically examines the tradeoffs between simple and sophisticated paraphrase harvesting approaches to help shed light on their strengths and weaknesses. Our evaluation reveals that very simple approaches fare surprisingly well and have a number of distinct advantages, including strong precision, good coverage, and low redundancy.