Paper: Evaluations Of NLG Systems: Common Corpus And Tasks Or Common Dimensions And Metrics?

ACL ID W06-1419
Title Evaluations Of NLG Systems: Common Corpus And Tasks Or Common Dimensions And Metrics?
Venue International Conference on Natural Language Generation
Session Main Conference
Year 2006
Authors

In this position paper, we argue that a common task and corpus are not the only ways to evaluate Natural Language Generation (NLG) systems. It might be, in fact, too narrow a view on evaluation and thus not be the best way to evaluate these systems. The aim of a common task and corpus is to allow for a comparative evaluation of systems, looking at the systems’ performances. It is thus a “system-oriented” view of evaluation. We argue here that, if we are to take a system-oriented view of evaluation, the community might be better served by enlarging the view of evaluation, defining common dimensions and metrics to evaluate systems and approaches. We also argue that end-user (or usability) evaluations form another important aspect of a system’s evaluation and should not be fo...