Paper: Corpus-Based Annotated Test Set For Machine Translation Evaluation By An Industrial User

ACL ID C96-2188
Title Corpus-Based Annotated Test Set For Machine Translation Evaluation By An Industrial User
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996
Authors

This article concerns the construction of a test data set to assist the industrial user in evaluating machine translation. It emphasizes the value of an approach based on studying the pragmatic characteristics of a bilingual corpus. A study of one chapter of the maintenance manual of the Super Puma helicopter made it possible to identify the pragmatic characteristics relevant to the choice of the morpho-syntactic structures and translation processes actually used. The textual test set consists of an SGML file in which the source text sequences are aligned with the reference translation sequences, and which also records the pragmatic, formal and translational characteristics as annotations (labels and formal descriptions).
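
To make the shape of such a test set concrete, the following is a minimal sketch of one aligned, annotated unit serialized as SGML-style markup. It is an illustration only: the abstract does not give the paper's actual DTD, so the element names (`unit`, `src`, `ref`), the attribute names, the label values, and the sentence pair are all hypothetical.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape, quoteattr


@dataclass
class AlignedUnit:
    """One source sequence aligned with its reference translation.

    The annotation keys are hypothetical stand-ins for the pragmatic,
    formal and translational characteristics named in the abstract.
    """
    source: str
    reference: str
    annotations: dict[str, str] = field(default_factory=dict)


def to_sgml(unit: AlignedUnit, unit_id: int) -> str:
    """Serialize one unit as SGML-style markup (assumed tag names)."""
    # Sort the annotations so the attribute order is deterministic.
    attrs = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in sorted(unit.annotations.items())
    )
    return (
        f'<unit id="{unit_id}"{attrs}>\n'
        f"  <src>{escape(unit.source)}</src>\n"
        f"  <ref>{escape(unit.reference)}</ref>\n"
        f"</unit>"
    )


# Invented example pair and labels, not taken from the corpus.
example = AlignedUnit(
    source="Remove the filter.",
    reference="Déposer le filtre.",
    annotations={"pragmatic": "instruction", "formal": "imperative"},
)
print(to_sgml(example, 1))
```

Storing the annotations as attributes on the aligned unit keeps each test item self-describing, so an evaluator could select items by characteristic (for example, all units carrying a given pragmatic label) before comparing system output with the reference translation.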