Paper: Generation that Exploits Corpus-Based Statistical Knowledge

ACL ID C98-1112
Title Generation that Exploits Corpus-Based Statistical Knowledge
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors

We describe novel aspects of a new natural language generator called Nitrogen. This generator has a highly flexible input representation that allows a spectrum of input from syntactic to semantic depth, and shifts the burden of many linguistic decisions to the statistical post-processor. The generation algorithm is compositional, making it efficient, yet it also handles non-compositional aspects of language. Nitrogen's design makes it robust and scalable, operating with lexicons and knowledge bases of one hundred thousand entities.
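
The idea of shifting linguistic decisions to a statistical post-processor can be illustrated with a minimal sketch. It is not Nitrogen's implementation: the toy corpus, the add-one-smoothed bigram model, and the names `CORPUS`, `train_bigrams`, `log_prob`, and `rank` are all assumptions made for illustration. The sketch assumes the generator proposes several candidate word orders and leaves the choice among them to corpus statistics.

```python
import math
from collections import Counter

# Toy corpus; purely illustrative, not data from the paper.
CORPUS = [
    "the dog chased the cat",
    "the cat saw the dog",
    "a dog saw a cat",
]

def train_bigrams(sentences):
    """Count bigram contexts and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams

def log_prob(sentence, contexts, bigrams, vocab_size):
    """Add-one-smoothed bigram log-probability of a candidate sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total

def rank(candidates, sentences=CORPUS):
    """Order candidates from most to least probable under the bigram model."""
    contexts, bigrams = train_bigrams(sentences)
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return sorted(
        candidates,
        key=lambda c: log_prob(c, contexts, bigrams, len(vocab)),
        reverse=True,
    )

if __name__ == "__main__":
    # Hypothetical candidates a generator might propose for the same input.
    for candidate in rank(["dog the chased cat the", "the dog chased the cat"]):
        print(candidate)
```

In this sketch the generator never has to decide word order itself; it only enumerates alternatives, and the candidate whose bigrams are best attested in the corpus is ranked first.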