Paper: Sarcasm Detection on Czech and English Twitter

ACL ID C14-1022
Title Sarcasm Detection on Czech and English Twitter
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014

This paper presents a machine learning approach to sarcasm detection on Twitter in two languages: English and Czech. Although there has been some research in sarcasm detection in languages other than English (e.g., Dutch, Italian, and Brazilian Portuguese), our work is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manually-labeled tweets and provide it to the community. We evaluate two classifiers with various combinations of features on both the Czech and English datasets. Furthermore, we tackle the issues of rich Czech morphology by examining different preprocessing techniques. Experiments show that our language-independent approach significantly outperforms adapted state-of-the-art methods in English (F...