Paper: The MURASAKI Project: Multilingual Natural Language Understanding

ACL ID H93-1028
Title The MURASAKI Project: Multilingual Natural Language Understanding
Venue Human Language Technologies
Session Main Conference
Year 1993
Authors

This paper describes a multilingual data extraction system under development for the Department of Defense (DoD). The system, called Murasaki, processes Spanish and Japanese newspaper articles reporting AIDS disease statistics. Key to Murasaki's design is its language-independent and domain-independent architecture. The system consists of shared processing modules across the three languages it currently handles (English, Japanese, and Spanish), shared general and domain-specific knowledge bases, and separate data modules for language-specific knowledge such as grammars, lexicons, morphological data and discourse data. This data-driven architecture is crucial to the success of Murasaki as a language-independent system; extending Murasaki to additional languages can be done for the most p...
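
The data-driven split described in the abstract (shared processing code, language-specific knowledge held as data) can be illustrated with a small sketch. This is not Murasaki's implementation: the names `LanguageData` and `Extractor`, the concept labels, and the toy lexicons are all hypothetical, and whitespace tokenization is a simplification that would not work for Japanese.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class LanguageData:
    """Language-specific knowledge held as data (hypothetical structure)."""
    name: str
    lexicon: Dict[str, str]  # surface form -> shared concept label (toy)


@dataclass
class Extractor:
    """Shared processing module: the same code runs for every language."""
    languages: Dict[str, LanguageData] = field(default_factory=dict)

    def register(self, data: LanguageData) -> None:
        # Adding a language means adding data, not changing this class.
        self.languages[data.name] = data

    def extract(self, language: str, text: str) -> List[str]:
        # Assumption: whitespace tokenization, for illustration only.
        lexicon = self.languages[language].lexicon
        tokens = text.lower().split()
        return [lexicon[t] for t in tokens if t in lexicon]


# Toy lexicons; the concept labels are invented for this example.
ENGLISH = LanguageData("english", {"aids": "DISEASE_AIDS", "cases": "CASE_COUNT"})
SPANISH = LanguageData("spanish", {"sida": "DISEASE_AIDS", "casos": "CASE_COUNT"})

extractor = Extractor()
extractor.register(ENGLISH)
extractor.register(SPANISH)

english_concepts = extractor.extract("english", "New AIDS cases reported")
spanish_concepts = extractor.extract("spanish", "Se registraron 120 casos de SIDA")
```

In this sketch both calls go through the same `extract` method and map to the same concept labels; only the registered data differs per language, which is the property the abstract attributes to the architecture.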