Paper: Domain-Specific Image Captioning

ACL ID W14-1602
Title Domain-Specific Image Captioning
Venue International Conference on Computational Natural Language Learning
Year 2014

We present a data-driven framework for image caption generation which incorporates visual and textual features with varying degrees of spatial structure. We propose the task of domain-specific image captioning, where many relevant visual details cannot be captured by off-the-shelf general-domain entity detectors. We extract previously-written descriptions from a database and adapt them to new query images, using a joint visual and textual bag-of-words model to determine the correctness of individual words. We implement our model using a large, unlabeled dataset of women's shoes images and natural language descriptions (Berg et al., 2010). Using both automatic and human evaluations, we show that our captioning method effectively deletes inaccurate words from extracted captio...