Paper: Midge: Generating Image Descriptions From Computer Vision Detections

ACL ID E12-1076
Title Midge: Generating Image Descriptions From Computer Vision Detections
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2012
Authors

This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation system outperforms state-of-the-art systems, automatically generating some of the most natural image descriptions to date.
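
As a rough illustration of the filtering idea in the abstract, the toy sketch below keeps only those detected attribute/object pairs that are both confidently detected and frequent enough in a co-occurrence table. All names (`Detection`, `COOCCURRENCE`, `filter_detections`, `describe`), the counts, and the thresholds are hypothetical and are not taken from the paper. The actual system generates syntactic trees, which this sketch does not attempt.

```python
from dataclasses import dataclass

# Hypothetical (modifier, noun) co-occurrence counts, invented for illustration.
COOCCURRENCE = {
    ("brown", "dog"): 120,
    ("furry", "dog"): 45,
    ("wooden", "dog"): 1,
    ("wooden", "table"): 200,
}


@dataclass(frozen=True)
class Detection:
    noun: str       # detected object label
    modifier: str   # detected attribute label
    score: float    # detector confidence in [0, 1]


def filter_detections(detections, min_score=0.5, min_count=10):
    """Keep detections that are confident and linguistically plausible."""
    kept = []
    for d in detections:
        count = COOCCURRENCE.get((d.modifier, d.noun), 0)
        if d.score >= min_score and count >= min_count:
            kept.append(d)
    return kept


def describe(detections):
    """Render the surviving detections as simple noun phrases."""
    return [f"a {d.modifier} {d.noun}" for d in detections]


detections = [
    Detection("dog", "brown", 0.9),
    Detection("dog", "wooden", 0.8),    # confident but implausible pairing
    Detection("table", "wooden", 0.3),  # plausible but low confidence
]
phrases = describe(filter_detections(detections))
```

In this toy setting only the "brown dog" detection survives: the "wooden dog" pairing is rejected by the co-occurrence threshold and the "wooden table" detection by the confidence threshold.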