Paper: Data-Oriented Methods For Grapheme-To-Phoneme Conversion

ACL ID E93-1007
Title Data-Oriented Methods For Grapheme-To-Phoneme Conversion
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 1993

It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that, using supervised learning techniques based on a corpus of transcribed words, the same or even better performance can be achieved without explicit modeling of linguistic knowledge. In this paper we present two instances of this approach. A first model implements a variant of instance-based learning, in which a weighted similarity metric and a database of prototypical exemplars are used to predict new mappings. In the second model, grapheme-to-phoneme mappings are looked up in a compressed text-to-speech lexicon (table lookup) enriched with default map...
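
As a rough illustration of the first model, the sketch below predicts a phoneme for each letter by finding the stored exemplars most similar to the letter's context window under a weighted overlap metric. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the window width, the hand-picked weights, the toy word list, and the phoneme symbols are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def windows(word, phonemes, width=1):
    """Yield (context window, phoneme) pairs, one per letter.
    Assumes a one-phoneme-per-letter alignment (a simplification)."""
    padded = "_" * width + word + "_" * width
    for i, phoneme in enumerate(phonemes):
        yield tuple(padded[i:i + 2 * width + 1]), phoneme

def weighted_overlap(a, b, weights):
    """Sum the weights of the window positions where a and b agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)

def predict(window, exemplars, weights):
    """Return the majority phoneme among the most similar exemplars."""
    scores = [(weighted_overlap(window, ex, weights), ph) for ex, ph in exemplars]
    best = max(score for score, _ in scores)
    votes = Counter(ph for score, ph in scores if score == best)
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus of aligned transcriptions (illustrative only).
TOY_CORPUS = {
    "cat": ["k", "a", "t"],
    "cot": ["k", "o", "t"],
    "cent": ["s", "e", "n", "t"],
    "city": ["s", "i", "t", "i"],
}
EXEMPLARS = [pair for word, phs in TOY_CORPUS.items() for pair in windows(word, phs)]

# Assumed weights for (left context, focus letter, right context).
WEIGHTS = (1, 3, 2)

# First letter of an unseen word such as "cell": window ('_', 'c', 'e').
print(predict(("_", "c", "e"), EXEMPLARS, WEIGHTS))
```

In this toy setting the right-hand context letter decides between the two pronunciations of "c", which is the kind of regularity a similarity-based learner can pick up from transcribed examples without hand-written rules.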