Paper: See No Evil, Say No Evil: Description Generation from Densely Labeled Images

ACL ID S14-1015
Title See No Evil, Say No Evil: Description Generation from Densely Labeled Images
Venue Joint Conference on Lexical and Computational Semantics
Session
Year 2014
Authors

This paper studies generation of descriptive sentences from densely annotated images. Previous work studied generation from automatically detected visual information but produced a limited class of sentences, hindered by currently unreliable recognition of activities and attributes. Instead, we collect human annotations of objects, parts, attributes and activities in images. These annotations allow us to build a significantly more comprehensive model of language generation and allow us to study what visual information is required to generate human-like descriptions. Experiments demonstrate high quality output and that activity annotations and relative spatial location of objects contribute most to producing high quality sentences.