Paper: Normalizing SMS: are Two Metaphors Better than One ?

ACL ID C08-1056
Title Normalizing SMS: are Two Metaphors Better than One ?
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008

Electronic written texts used in computer-mediated interactions (e-mails, blogs, chats, etc.) present major deviations from the norm of the language. This paper presents a comparative study of systems aiming at normalizing the orthography of French SMS messages: after discussing the linguistic peculiarities of these messages, and possible approaches to their automatic normalization, we present, evaluate and contrast two systems, one drawing inspiration from the Machine Translation task, the other using techniques that are commonly used in automatic speech recognition devices. Combining both approaches, our best normalization system achieves about 11% Word Error Rate on a test set of about 3000 unseen messages.
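
The Word Error Rate quoted above is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between the system output and the reference normalization, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate`, the whitespace tokenization and the sample sentence are assumptions made for illustration; the abstract does not state the paper's scoring protocol.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (case- and accent-sensitive). This is the textbook definition, not
    necessarily the scoring script used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example (not taken from the paper's data): the system
# output misses one accent, giving one substitution out of five words.
example_wer = word_error_rate("je suis à la maison", "je suis a la maison")
```

Under this definition, a score of about 11% means roughly one word-level edit for every nine reference words, averaged over the test set.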