Paper: Hierarchical Bayesian Language Modelling for the Linguistically Informed

ACL ID E12-3008
Title Hierarchical Bayesian Language Modelling for the Linguistically Informed
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Student Session
Year 2012
Authors

In this work I address the challenge of augmenting n-gram language models according to prior linguistic intuitions. I argue that the family of hierarchical Pitman-Yor language models is an attractive vehicle through which to address the problem, and demonstrate the approach by proposing a model for German compounds. In an empirical evaluation, the model outperforms the Kneser-Ney model in terms of perplexity, and achieves preliminary improvements in English-German translation.