Paper: A Comparison of Techniques to Automatically Identify Complex Words.

ACL ID P13-3015
Title A Comparison of Techniques to Automatically Identify Complex Words.
Venue Annual Meeting of the Association for Computational Linguistics
Session Student Session
Year 2013
Authors

Identifying complex words (CWs) is an important, yet often overlooked, task within lexical simplification (the process of automatically replacing CWs with simpler alternatives). If too many words are identified then substitutions may be made erroneously, leading to a loss of meaning. If too few words are identified then those which impede a user's understanding may be missed, resulting in a complex final text. This paper addresses the task of evaluating different methods for CW identification. A corpus of sentences with annotated CWs is mined from Simple Wikipedia edit histories, which is then used as the basis for several experiments. Firstly, the corpus design is explained and the results of the validation experiments using human judges are reported. Experiments are carried o...