Paper: Domain-Independent Captioning of Domain-Specific Images

ACL ID N13-2010
Title Domain-Independent Captioning of Domain-Specific Images
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Student Session
Year 2013

Automatically describing visual content is an extremely difficult task, with hard AI problems in Computer Vision (CV) and Natural Language Processing (NLP) at its core. Previous work relies on supervised visual recognition systems to determine the content of images. These systems require massive amounts of hand-labeled data for training, so the number of visual classes that can be recognized is typically very small. We argue that these approaches place unrealistic limits on the kinds of images that can be captioned, and are unlikely to produce captions which reflect human interpretations. We present a framework for image caption generation that does not rely on visual recognition systems, which we have implemented on a dataset of online shopping images and product descrip...