Paper: Hierarchical Discriminative Classification for Text-Based Geolocation

ACL ID D14-1039
Title Hierarchical Discriminative Classification for Text-Based Geolocation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

Text-based document geolocation is commonly rooted in language-based information retrieval techniques over geodesic grids. These methods ignore the natural hierarchy of cells in such grids and fall afoul of independence assumptions. We demonstrate the effectiveness of using logistic regression models on a hierarchy of nodes in the grid, which improves upon the state of the art accuracy by several percent and reduces mean error distances by hundreds of kilometers on data from Twitter, Wikipedia, and Flickr. We also show that logistic regression performs feature selection effectively, assigning high weights to geocentric terms.
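
The idea of classifying over a hierarchy of grid cells can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-level uniform grid (10-degree cells split into 5-degree cells), the bag-of-words features, the plain SGD training loop, the greedy top-down descent, the toy documents, and all names (`cell_of`, `SoftmaxLR`, `train_hierarchy`, `geolocate`, `DOCS`). The abstract does not specify grid construction, features, or search strategy.

```python
import math
from collections import defaultdict

COARSE, FINE = 10.0, 5.0  # assumed cell sizes in degrees; FINE cells nest inside COARSE cells


def cell_of(lat, lon, size):
    """Index (row, col) of the uniform grid cell of `size` degrees containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))


class SoftmaxLR:
    """Tiny multinomial logistic regression over bag-of-words tokens (illustrative only)."""

    def __init__(self, labels):
        self.labels = sorted(labels)
        self.w = {c: defaultdict(float) for c in self.labels}

    def probs(self, tokens):
        scores = {c: sum(self.w[c][t] for t in tokens) for c in self.labels}
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {c: math.exp(s - top) for c, s in scores.items()}
        total = sum(exps.values())
        return {c: e / total for c, e in exps.items()}

    def fit(self, data, epochs=50, lr=0.5):
        # Stochastic gradient ascent on the log-likelihood.
        for _ in range(epochs):
            for tokens, gold in data:
                p = self.probs(tokens)
                for c in self.labels:
                    grad = (1.0 if c == gold else 0.0) - p[c]
                    for t in tokens:
                        self.w[c][t] += lr * grad
        return self

    def predict(self, tokens):
        p = self.probs(tokens)
        return max(self.labels, key=lambda c: p[c])


def train_hierarchy(docs):
    """Train one classifier at the root (over coarse cells) and one per coarse cell
    (over the fine cells it contains)."""
    root_data = [(toks, cell_of(lat, lon, COARSE)) for toks, lat, lon in docs]
    root = SoftmaxLR({c for _, c in root_data}).fit(root_data)
    by_parent = defaultdict(list)
    for toks, lat, lon in docs:
        by_parent[cell_of(lat, lon, COARSE)].append((toks, cell_of(lat, lon, FINE)))
    children = {p: SoftmaxLR({c for _, c in d}).fit(d) for p, d in by_parent.items()}
    return root, children


def geolocate(tokens, root, children):
    """Greedy descent: pick a coarse cell, then a fine cell inside it; return its center."""
    coarse = root.predict(tokens)
    row, col = children[coarse].predict(tokens)
    return ((row + 0.5) * FINE, (col + 0.5) * FINE)


# Toy documents (made up) as (tokens, latitude, longitude), with approximate
# coordinates for New York, Pittsburgh, and San Francisco.
DOCS = [
    (["subway", "bagel", "yankees"], 40.7, -74.0),
    (["bagel", "broadway"], 40.7, -74.0),
    (["steelers", "pierogi"], 40.44, -79.99),
    (["steelers", "bridges"], 40.44, -79.99),
    (["fog", "cablecar", "giants"], 37.77, -122.42),
    (["fog", "bridge"], 37.77, -122.42),
]

root, children = train_hierarchy(DOCS)
print(geolocate(["bagel", "subway"], root, children))
```

In this toy setup the New York and Pittsburgh documents share a coarse cell, so the root classifier only has to separate east from west, and the child classifier for that coarse cell separates the two cities. Inspecting `root.w` after training shows location-specific tokens such as "bagel" or "fog" receiving the largest weights for their cells, a small-scale analogue of the abstract's remark that logistic regression assigns high weights to geocentric terms.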