Paper: Gender Attribution: Tracing Stylometric Evidence Beyond Topic and Genre

ACL ID W11-0310
Title Gender Attribution: Tracing Stylometric Evidence Beyond Topic and Genre
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2011
Authors

Sociolinguistic theories (e.g., Lakoff (1973)) postulate that women’s language styles differ from those of men. In this paper, we explore statistical techniques that can learn to identify the gender of authors in modern English text, such as web blogs and scientific papers. Although recent work has shown the efficacy of statistical approaches to gender attribution, we conjecture that the reported performance might be overly optimistic due to non-stylistic factors, such as topic bias in gender, that can make the gender detection task easier. Our work is the first that consciously avoids gender bias in topics, thereby providing stronger evidence for gender-specific styles in language beyond topic. In addition, our comparative study provides new insights into the robustness of various stylo...