Paper: A Spelling Correction Program Based On A Noisy Channel Model

ACL ID C90-2036
Title A Spelling Correction Program Based On A Noisy Channel Model
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1990
Authors

This paper describes a new program, correct, which takes words rejected by the Unix® spell program, proposes a list of candidate corrections, and sorts them by probability. The probability scores are the novel contribution of this work. Probabilities are based on a noisy channel model. It is assumed that the typist knows what words he or she wants to type but some noise is added on the way to the keyboard (in the form of typos and spelling errors). Using a classic Bayesian argument of the kind that is popular in the speech recognition literature (Jelinek, 1985), one can often recover the intended correction, c, from a typo, t, by finding the correction c that maximizes Pr(c) Pr(t|c). The first factor, Pr(c), is a prior model of word probabilities; the second factor, Pr(t|c), is a model of ...
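
The ranking rule in the abstract, choosing the candidate c that maximizes Pr(c) Pr(t|c), can be illustrated with a minimal Python sketch. This is not the correct program itself. The function name rank_corrections, the typo "teh", the candidate words, and every probability value below are invented for illustration, and the abstract does not say how either factor is estimated.

```python
from typing import Callable, List, Tuple


def rank_corrections(
    typo: str,
    candidates: List[str],
    prior: Callable[[str], float],
    channel: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Sort candidates by Pr(c) * Pr(t|c), highest first.

    Scores are normalized over the candidate list so they sum to 1,
    which makes them easier to read as relative probabilities.
    """
    scored = [(c, prior(c) * channel(typo, c)) for c in candidates]
    total = sum(score for _, score in scored)
    if total > 0:
        scored = [(c, score / total) for c, score in scored]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy numbers, NOT taken from the paper.
TOY_PRIOR = {"the": 0.05, "ten": 0.001, "tea": 0.0005}
TOY_CHANNEL = {
    ("teh", "the"): 0.002,
    ("teh", "ten"): 0.0001,
    ("teh", "tea"): 0.0001,
}


def toy_prior(c: str) -> float:
    # Pr(c): how likely the word is before seeing the typo.
    return TOY_PRIOR.get(c, 0.0)


def toy_channel(t: str, c: str) -> float:
    # Pr(t|c): how likely the typist produces t when intending c.
    return TOY_CHANNEL.get((t, c), 0.0)


if __name__ == "__main__":
    for word, score in rank_corrections(
        "teh", ["ten", "tea", "the"], toy_prior, toy_channel
    ):
        print(f"{word}\t{score:.4f}")
```

With these made-up values, "the" is ranked first because both its prior and its channel probability are the largest. In general either factor alone can be outweighed by the other, which is the point of multiplying them.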