Paper: Online Entropy-Based Model of Lexical Category Acquisition

ACL ID W10-2922
Title Online Entropy-Based Model of Lexical Category Acquisition
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2010

Children learn a robust representation of lexical categories at a young age. We propose an incremental model of this process which efficiently groups words into lexical categories based on their local context using an information-theoretic criterion. We train our model on a corpus of child-directed speech from CHILDES and show that the model learns a fine-grained set of intuitive word categories. Furthermore, we propose a novel evaluation approach by comparing the efficiency of our induced categories against other category sets (including traditional part of speech tags) in a variety of language tasks. We show the categories induced by our model typically outperform the other category sets.

1 The Acquisition of Lexical Categories

Psycholinguistic studies suggest that early on chil- ...