Paper: A Dataset for Research on Short-Text Conversations

ACL ID: D13-1096
Title: A Dataset for Research on Short-Text Conversations
Venue: Conference on Empirical Methods in Natural Language Processing
Session: Main Conference
Year: 2013

Natural language conversation is widely regarded as a highly difficult problem, which is usually attacked with either rule-based or learning-based models. In this paper we propose a retrieval-based automatic response model for short-text conversation, to exploit the vast number of short conversation instances available on social media. For this purpose we introduce a dataset of short-text conversations based on real-world instances from Sina Weibo (a popular Chinese microblog service), which will soon be released to the public. This dataset provides a rich collection of instances for research on finding natural and relevant short responses to a given short text, and is useful for both training and testing conversation models. This dataset consists of both naturally formed conv...