Paper: Revisiting Readability: A Unified Framework for Predicting Text Quality

ACL ID D08-1020
Title Revisiting Readability: A Unified Framework for Predicting Text Quality
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

We combine lexical, syntactic, and discourse features to produce a highly predictive model of human readers’ judgments of text readability. This is the first study to take into account such a variety of linguistic factors and the first to empirically demonstrate that discourse relations are strongly associated with the perceived quality of text. We show that various surface metrics generally expected to be related to readability are not very good predictors of readability judgments in our Wall Street Journal corpus. We also establish that readability predictors behave differently depending on the task: predicting text readability or ranking the readability. Our experiments indicate that discourse relations are the one class of features that exhibits robustness across these ...