Paper: Data-Driven Response Generation in Social Media

ACL ID D11-1054
Title Data-Driven Response Generation in Social Media
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011

We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic s...