Paper: Joint Learning of Phonetic Units and Word Pronunciations for ASR

ACL ID D13-1019
Title Joint Learning of Phonetic Units and Word Pronunciations for ASR
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013
Authors

The creation of a pronunciation lexicon remains the most inefficient process in developing an Automatic Speech Recognizer (ASR). In this paper, we propose an unsupervised alternative, requiring no language-specific knowledge, to the conventional manual approach for creating pronunciation dictionaries. We present a hierarchical Bayesian model, which jointly discovers the phonetic inventory and the Letter-to-Sound (L2S) mapping rules in a language using only transcribed data. When tested on a corpus of spontaneous queries, the results demonstrate the superiority of the proposed joint learning scheme over its sequential counterpart, in which the latent phonetic inventory and L2S mappings are learned separately. Furthermore, the recognizers built with the automatically induce...