Paper: From Controlled Document Authoring To Interactive Document Normalization

ACL ID C04-1166
Title From Controlled Document Authoring To Interactive Document Normalization
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004
Authors
  • Aurélien Max (Institute of Information and Applied Mathematics Grenoble, Grenoble France)

This paper presents an approach to normalize documents in constrained domains. This approach reuses resources developed for controlled document authoring and is decomposed into three phases. First, candidate content representations for an input document are automatically built. Then, the content representation that best corresponds to the document according to an expert of the class of documents is identified. This content representation is finally used to generate the normalized version of the document. The current version of our prototype system is presented, and its limitations are discussed.

1 Document normalization

The authoring of documents in constrained domains and their translation into other languages is a very important activity in industrial settings. In some cases,...