Paper: Corpus-Guided Sentence Generation of Natural Images

ACL ID D11-1041
Title Corpus-Guided Sentence Generation of Natural Images
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011

We propose a sentence generation strategy that describes images by predicting the most likely nouns, verbs, scenes and prepositions that make up the core sentence structure. The input are initial noisy estimates of the objects and scenes detected in the image using state of the art trained detectors. As predicting actions from still images directly is unreliable, we use a language model trained from the English Gi- gaword corpus to obtain their estimates; to- gether with probabilities of co-located nouns, scenes and prepositions. We use these esti- mates as parameters on a HMM that models the sentence generation process, with hidden nodes as sentence components and image de- tections as the emissions. Experimental re- sults show that our strategy of combining vi- sion and language produces...