Paper: The Impact of Reference Quality on Automatic MT Evaluation

ACL ID C08-2010
Title The Impact of Reference Quality on Automatic MT Evaluation
Venue International Conference on Computational Linguistics
Session Poster Session
Year 2008
Authors
  • Olivier Hamon (Evaluations and Language Resources Distribution Agency (ELDA), Paris, France; University of Paris 13, Villetaneuse, France)
  • Djamel Mostefa (Evaluations and Language Resources Distribution Agency (ELDA), Paris, France)

Language resource quality is crucial in NLP. Many of the resources used are derived from data created by humans outside an NLP context, especially regarding MT and reference translations. Indeed, automatic evaluations need high-quality data that allow the comparison of both automatic and human translations. Validating these resources before use is widely recommended. This paper describes the impact of using references of different quality on evaluation. Surprisingly, similar scores are obtained in many cases regardless of reference quality. The limitations of the automatic metrics used within MT are therefore also discussed in this regard.
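To make the setup concrete, the following is a minimal, self-contained sketch of reference-based scoring in the style of BLEU (clipped n-gram precision with a brevity penalty). The paper does not prescribe this exact implementation; it is only meant to show how a hypothesis is scored against one or more references, and hence why the choice of reference directly shapes the score.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(hypothesis, references, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty.
    `hypothesis` is a string; `references` is a list of strings."""
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        # For each n-gram, clip by its maximum count over all references.
        max_ref = Counter()
        for ref in refs:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_prec_sum += math.log(clipped / total)
    # Brevity penalty against the closest reference length.
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / len(hyp))
    return bp * math.exp(log_prec_sum / max_n)

# The same hypothesis scored against a matching vs. a divergent reference:
hyp = "the cat sat on the mat"
print(sentence_bleu(hyp, ["the cat sat on the mat"]))       # identical reference -> 1.0
print(sentence_bleu(hyp, ["a cat was sitting on a mat"]))   # paraphrased reference -> 0.0
```

Note that a valid paraphrase drives this sketch to zero, which illustrates the metric's sensitivity to surface overlap with the reference rather than to translation quality as such; the paper's finding that differently produced references often yield similar corpus-level scores speaks to the same limitation.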