Paper: Evaluation Metrics For Generation

ACL ID: W00-1401
Title: Evaluation Metrics For Generation
Venue: International Conference on Natural Language Generation
Session: Main Conference
Year: 2000

Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system internal) metrics which we have used for baseline quantitative assessment. This quantitative assessment should then be augmented to a fuller evaluation that examines qualitative aspects. To this end, we describe an experiment that tests correlation between the quantitative metrics and human qualitative judgment. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability and can be used for evaluation during...
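As an illustration of the kind of correlation test the abstract mentions, the following is a minimal sketch of how agreement between an automatic metric and human ratings might be measured. It assumes Pearson's correlation coefficient, which the abstract does not name; the function `pearson` and the toy scores are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: Pearson's r is used here purely for illustration; the
    abstract does not state which correlation statistic the authors used.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one automatic metric score and one averaged
# human rating per generated sentence. These values are made up.
metric_scores = [0.35, 0.50, 0.62, 0.80, 0.91]
human_ratings = [2.0, 3.0, 3.0, 4.0, 5.0]

r = pearson(metric_scores, human_ratings)
print(f"Pearson r between metric and human ratings: {r:.3f}")
```

A value of r near 1 would suggest the metric tracks human judgment closely on this sample, while a value near 0 would suggest it does not. A real study would also need a significance test and far more items than this toy sample.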