Paper: Web and Corpus Methods for Malay Count Classifier Prediction

ACL ID N09-2018
Title Web and Corpus Methods for Malay Count Classifier Prediction
Venue Human Language Technologies
Session Short Paper
Year 2009
Authors

We examine the capacity of Web and corpus frequency methods to predict preferred count classifiers for nouns in Malay. The Web model, with an observed F-score of 0.671, considerably outperformed the corpus-based frequency and machine learning models. We expect that this is a fruitful extension for Web-as-corpus approaches to lexicons in languages other than English, but further research is required in other South-East and East Asian languages.
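As a rough illustration of the frequency-based idea in the abstract, the sketch below picks, for a given noun, the classifier whose classifier-plus-noun pattern has the highest hit count. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the candidate classifiers, and the counts are hypothetical and invented for illustration, and the abstract does not specify how counts were gathered or how the F-score was computed. The helper `f_score` is the standard balanced F-score (harmonic mean of precision and recall).

```python
from typing import Dict, Optional


def predict_classifier(counts: Dict[str, int]) -> Optional[str]:
    """Return the classifier with the highest co-occurrence count.

    `counts` maps a candidate classifier to the number of times it was
    observed with the noun (e.g. Web hits or corpus frequency).
    Returns None when there is no evidence for any candidate.
    """
    if not counts or max(counts.values()) == 0:
        return None
    return max(counts, key=counts.get)


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a single noun; the numbers are made up
# for illustration and are not taken from the paper.
toy_counts = {"ekor": 120, "buah": 15, "orang": 3}
predicted = predict_classifier(toy_counts)
```

With these toy counts, `predicted` is the classifier with the largest count. A real system would also need a way to obtain the counts (search-engine queries or a corpus index), which is deliberately left out here.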