Paper: Image Description using Visual Dependency Representations

ACL ID D13-1128
Title Image Description using Visual Dependency Representations
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013

Describing the main event of an image involves identifying the objects depicted and predicting the relationships between them. Previous approaches have represented images as unstructured bags of regions, which makes it difficult to accurately predict meaningful relationships between regions. In this paper, we introduce visual dependency representations to capture the relationships between the objects in an image, and hypothesize that this representation can improve image description. We test this hypothesis using a new data set of region-annotated images, associated with visual dependency representations and gold-standard descriptions. We describe two template-based description generation models that operate over visual dependency representations. In an image descriptio...