Paper: Obfuscating Document Stylometry To Preserve Author Anonymity

ACL ID P06-2058
Title Obfuscating Document Stylometry To Preserve Author Anonymity
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006

This paper explores techniques for reducing the effectiveness of standard authorship attribution techniques so that an author A can preserve anonymity for a particular document D. We discuss feature selection and adjustment and show how this information can be fed back to the author to create a new document D’ for which the calculated attribution moves away from A. Since it can be labor intensive to adjust the document in this fashion, we attempt to quantify the amount of effort required to produce the anonymized document and introduce two levels of anonymization: shallow and deep. In our test set, we show that shallow anonymization can be achieved by making 14 changes per 1000 words to reduce the likelihood of identifying A as the author by an average of more than 83%. F...
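
The feedback loop described in the abstract (measure stylistic features, then tell the author which ones to adjust) can be illustrated with a minimal sketch. This is not the paper's implementation: the choice of word frequencies per 1000 words as features, the helper names `rate_per_1000`, `suggest_adjustments`, and `estimated_changes`, and all sample numbers are assumptions made for illustration. Only the figure of 14 changes per 1000 words comes from the abstract, and it was reported for the authors' test set.

```python
import re
from collections import Counter


def rate_per_1000(text, words):
    """Occurrences of each word per 1000 tokens (hypothetical feature choice)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: 1000.0 * counts[w] / total for w in words}


def suggest_adjustments(doc_rates, author_rates, other_rates):
    """Rank features by how strongly they separate author A from other authors.

    Returns (word, delta, gap) tuples, where delta is the change in rate that
    would move the document to the other authors' typical rate (an assumed
    target, not the paper's), and gap is the author-vs-others separation.
    """
    suggestions = []
    for word, doc_rate in doc_rates.items():
        gap = abs(author_rates[word] - other_rates[word])
        delta = other_rates[word] - doc_rate
        suggestions.append((word, delta, gap))
    # Most discriminating features first.
    suggestions.sort(key=lambda item: item[2], reverse=True)
    return suggestions


def estimated_changes(word_count, changes_per_1000=14):
    """Scale the abstract's shallow-anonymization rate to a document length."""
    return changes_per_1000 * word_count / 1000
```

Under these assumptions, a negative `delta` means the author would use that word less often in D’, and `estimated_changes(2500)` gives a rough edit budget for a 2500-word document if the reported rate carried over, which the abstract does not claim for other data.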