Paper: Mapping Dialectal Variation by Querying Social Media

ACL ID E14-1011
Title Mapping Dialectal Variation by Querying Social Media
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2014

We propose a Bayesian method of estimating a conditional distribution of data given metadata (e.g., the usage of a dialectal variant given a location) based on queries from a big data/social media source, such as Twitter. This distribution is structurally equivalent to those built from traditional experimental methods, despite lacking negative examples. Tests using Twitter to investigate the geographic distribution of dialectal forms show that this method can provide distributions that are tightly correlated with existing gold-standard studies at a fraction of the time, cost, and effort.
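
As an illustration of how a conditional distribution might be recovered from positive-only query results, the sketch below applies Bayes' rule: P(variant | location) is proportional to P(location | variant) / P(location). This is one plausible reading of the abstract, not the paper's actual implementation. The function name `relative_usage`, the baseline-query idea, the add-one smoothing, and the toy region labels are all assumptions introduced here for illustration.

```python
from collections import Counter


def relative_usage(variant_locs, baseline_locs, smoothing=1.0):
    """Score each region by P(location | variant) / P(location).

    By Bayes' rule this ratio equals P(variant | location) / P(variant),
    so it ranks regions by how strongly they favour the variant even
    though the query returns no negative examples.

    variant_locs:  region labels of posts matching the dialect query
    baseline_locs: region labels of posts from a hypothetical
                   dialect-neutral query, used to estimate P(location)
    smoothing:     additive smoothing constant (an assumption here)
    """
    regions = sorted(set(variant_locs) | set(baseline_locs))
    variant_counts = Counter(variant_locs)
    baseline_counts = Counter(baseline_locs)
    k = len(regions)

    scores = {}
    for region in regions:
        # Smoothed estimate of P(location | variant)
        p_loc_given_variant = (variant_counts[region] + smoothing) / (
            len(variant_locs) + smoothing * k
        )
        # Smoothed estimate of P(location) from the baseline sample
        p_loc = (baseline_counts[region] + smoothing) / (
            len(baseline_locs) + smoothing * k
        )
        scores[region] = p_loc_given_variant / p_loc
    return scores


# Toy data, invented for illustration only
variant_hits = ["south"] * 8 + ["north"] * 2
baseline_hits = ["south"] * 5 + ["north"] * 5
scores = relative_usage(variant_hits, baseline_hits)
print(scores)
```

A score above 1 suggests the variant is over-represented in that region relative to overall posting volume; multiplying by an estimate of P(variant) would turn the score into a probability, but that prior is left unspecified here.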