Paper: Automatic Acquisition of Lexical Formality

ACL ID C10-2011
Title Automatic Acquisition of Lexical Formality
Venue International Conference on Computational Linguistics
Session Poster Session
Year 2010

There has been relatively little work focused on determining the formality level of individual lexical items. This study applies information from large mixed-genre corpora, demonstrating that significant improvement is possible over simple word-length metrics, particularly when multiple sources of information, i.e. word length, word counts, and word association, are integrated. Our best hybrid system reaches 86% accuracy on an English near-synonym formality identification task, and near perfect accuracy when comparing words with extreme formality differences. We also test our word association method in Chinese, a language where word length is not an appropriate metric for formality.