Paper: Automated Suggestions for Miscollocations

ACL ID W09-2107
Title Automated Suggestions for Miscollocations
Venue Innovative Use of NLP for Building Educational Applications
Year 2009

One of the most common and persistent error types in second language writing is the collocation error, such as learn knowledge instead of gain or acquire knowledge, or make damage rather than cause damage. In this work-in-progress report, we propose a probabilistic model for suggesting corrections to lexical collocation errors. The probabilistic model incorporates three features: word association strength (MI), semantic similarity (via WordNet) and the notion of shared collocations (or intercollocability). The results suggest that the combination of all three features outperforms any single feature or any combination of two features.

1 Collocation in Language Learning

The importance and difficulty of collocations for second language users have been widely acknowledg...