Paper: A Broad Evaluation of Techniques for Automatic Acquisition of Multiword Expressions

ACL ID W12-3301
Title A Broad Evaluation of Techniques for Automatic Acquisition of Multiword Expressions
Venue Annual Meeting of the Association for Computational Linguistics
Session Student Session
Year 2012
Authors

Several approaches have been proposed for the automatic acquisition of multiword expressions from corpora. However, there is no agreement about which of them presents the best cost-benefit ratio, as they have been evaluated on distinct datasets and/or languages. To address this issue, we investigate these techniques, analysing the following dimensions: expression type (compound nouns, phrasal verbs), language (English, French) and corpus size. Results show that these techniques tend to extract similar candidate lists with high recall (~80%) for nominals and high precision (~70%) for verbals. The use of association measures for candidate filtering is useful but some of them are more onerous and not significantly better than raw counts. We finish with an evaluation of flexibility and ...
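
To make the contrast between raw counts and association measures concrete, here is a minimal sketch that scores adjacent word pairs by both frequency and pointwise mutual information (PMI). PMI is chosen here only as a familiar example of an association measure; the abstract does not say which measures the paper evaluates. The function name `score_bigrams` and the toy sentence are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def score_bigrams(tokens):
    """Map each adjacent word pair to (raw_count, pmi).

    Illustrative assumption: candidates are plain adjacent bigrams and
    probabilities are maximum-likelihood estimates from the same token list.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of bigram positions

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = (count, math.log2(p_xy / (p_x * p_y)))
    return scores


if __name__ == "__main__":
    toy = "the ice cream melted and the ice cream dripped".split()
    scored = score_bigrams(toy)
    # Rank the same candidates by raw count and by PMI to compare the filters.
    by_count = sorted(scored, key=lambda b: scored[b][0], reverse=True)
    by_pmi = sorted(scored, key=lambda b: scored[b][1], reverse=True)
    print("by count:", by_count[:3])
    print("by PMI:  ", by_pmi[:3])
```

Raw counts need only the bigram table, whereas PMI also needs unigram statistics and is known to favour rare pairs, which is one sense in which a measure can cost more without necessarily ranking candidates better.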