Paper: Unsupervised Decomposition of a Document into Authorial Components

ACL ID P11-1136
Title Unsupervised Decomposition of a Document into Authorial Components
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

We propose a novel unsupervised method for separating out distinct authorial components of a document. In particular, we show that, given a book artificially “munged” from two thematically similar biblical books, we can separate out the two constituent books almost perfectly. This allows us to automatically recapitulate many conclusions reached by Bible scholars over centuries of research. One of the key elements of our method is exploitation of differences in synonym choice by different authors.
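
The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of using synonym choice as an authorship signal. It is not the paper's method. The synonym sets, the scoring rule, and the function names (`synonym_features`, `similarity`, `two_way_split`) are all assumptions made for illustration: chunks that pick the same word from a synonym set are treated as similar, chunks that pick different words as dissimilar, and the two most dissimilar chunks seed a two-way split.

```python
from itertools import combinations

# Hypothetical synonym sets (an assumption for this sketch): each set lists
# words an author might use interchangeably for the same meaning.
SYNSETS = [
    {"big", "large"},
    {"begin", "start"},
    {"quick", "fast"},
]

def synonym_features(text):
    """Return the synonym-set members that occur in the text."""
    words = set(text.lower().split())
    return {w for group in SYNSETS for w in group if w in words}

def similarity(a, b):
    """+1 for each synonym set where both chunks share a choice, -1 where they differ."""
    score = 0
    for group in SYNSETS:
        choice_a, choice_b = a & group, b & group
        if choice_a and choice_b:  # both chunks express this meaning
            score += 1 if choice_a & choice_b else -1
    return score

def two_way_split(chunks):
    """Assign each chunk to label 0 or 1 based on synonym choice."""
    feats = [synonym_features(c) for c in chunks]
    # Seed the two groups with the most dissimilar pair of chunks.
    i, j = min(combinations(range(len(chunks)), 2),
               key=lambda p: similarity(feats[p[0]], feats[p[1]]))
    # Each chunk joins whichever seed it agrees with more (ties go to seed i).
    return [0 if similarity(f, feats[i]) >= similarity(f, feats[j]) else 1
            for f in feats]

chunks = [
    "the big dog will begin",
    "a large cat will start",
    "big plans begin today",
    "large ships start fast",
]
labels = two_way_split(chunks)
```

With these four made-up chunks, the first and third share one set of choices ("big", "begin") and the second and fourth share the other ("large", "start"), so they end up under different labels. A realistic setting would need much larger synonym sets and a proper clustering step.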