Paper: Large Language Models in Machine Translation

ACL ID D07-1090
Title Large Language Models in Machine Translation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007
Authors

This paper reports on the benefits of large-scale statistical language modeling in machine translation. We propose a distributed infrastructure that we use to train on up to 2 trillion tokens, resulting in language models with up to 300 billion n-grams. It is capable of providing smoothed probabilities for fast, single-pass decoding. We introduce a new smoothing method, dubbed Stupid Backoff, that is inexpensive to train on large data sets and approaches the quality of Kneser-Ney Smoothing as the amount of training data increases.
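The Stupid Backoff score named in the abstract is a simple recursion: use the relative frequency of the n-gram if it was observed, otherwise back off to the shortened context and multiply by a fixed factor (0.4 in the paper). The sketch below illustrates that recursion on small in-memory counts; the function and variable names are illustrative only and are not part of the paper's distributed implementation, which serves counts from sharded servers rather than a local dictionary.

```python
from collections import defaultdict

def count_ngrams(corpus, max_order=3):
    """Count all n-grams up to max_order in a corpus given as lists of tokens."""
    counts = defaultdict(int)
    for sentence in corpus:
        for n in range(1, max_order + 1):
            for i in range(len(sentence) - n + 1):
                counts[tuple(sentence[i:i + n])] += 1
    return counts

def stupid_backoff_score(counts, context, word, total_tokens, alpha=0.4):
    """Unnormalized Stupid Backoff score S(word | context).

    Uses the relative frequency f(context + word) / f(context) when the full
    n-gram was seen; otherwise backs off to a shorter context, multiplying
    by the fixed factor alpha at each backoff step. The base case is the
    unigram relative frequency.
    """
    multiplier = 1.0
    while context:
        ngram = context + (word,)
        if counts.get(ngram, 0) > 0 and counts.get(context, 0) > 0:
            return multiplier * counts[ngram] / counts[context]
        context = context[1:]   # drop the leftmost word of the context
        multiplier *= alpha
    return multiplier * counts.get((word,), 0) / total_tokens

# Toy usage on a two-sentence corpus.
corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
counts = count_ngrams(corpus)
total = sum(len(s) for s in corpus)
print(stupid_backoff_score(counts, ("the", "cat"), "sat", total))  # seen trigram: 0.5
print(stupid_backoff_score(counts, ("the", "dog"), "sat", total))  # backs off twice: 0.4^2 * 1/6
```

Because the scores are not normalized into a probability distribution, there is no discounting or normalization pass over the counts, which is what makes the method cheap to apply at the trillion-token scale the abstract describes.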