Paper: Integrating Multi-level Linguistic Knowledge with a Unified Framework for Mandarin Speech Recognition

ACL ID D08-1086
Title Integrating Multi-level Linguistic Knowledge with a Unified Framework for Mandarin Speech Recognition
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

To improve Mandarin large vocabulary continuous speech recognition (LVCSR), a unified framework-based approach is introduced to exploit multi-level linguistic knowledge. In this framework, each knowledge source is represented by a Weighted Finite State Transducer (WFST), and the transducers are then combined to obtain a so-called analyzer that integrates the multi-level knowledge sources. Owing to the uniform transducer representation, any knowledge source can be easily integrated into the analyzer, as long as it can be encoded as a WFST. Moreover, because the knowledge at each level is modeled independently and the combination is performed at the model level, the information inherent in each knowledge source has a chance to be thoroughly exploited. By simulations, the effectiveness of the analyz...
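
The idea of encoding each knowledge source as a WFST and combining them at the model level can be illustrated with a toy sketch. This is not the paper's implementation: the class `WFST`, the functions `compose` and `best_output`, the syllable and character labels, and all weights are hypothetical. The sketch assumes epsilon-free transducers over the tropical semiring (costs add along a path, and the cheapest path wins).

```python
from collections import defaultdict, deque


class WFST:
    """Toy epsilon-free weighted transducer (tropical semiring, hypothetical)."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = dict(finals)      # state -> final cost
        self.arcs = defaultdict(list)   # state -> [(in, out, cost, next_state)]

    def add_arc(self, src, ilabel, olabel, cost, dst):
        self.arcs[src].append((ilabel, olabel, cost, dst))


def compose(a, b):
    """Compose a with b by matching a's output labels to b's input labels."""
    start = (a.start, b.start)
    result = WFST(start, {})
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a.finals and qb in b.finals:
            result.finals[(qa, qb)] = a.finals[qa] + b.finals[qb]
        for ia, oa, wa, na in a.arcs.get(qa, []):
            for ib, ob, wb, nb in b.arcs.get(qb, []):
                if oa == ib:
                    nxt = (na, nb)
                    result.add_arc((qa, qb), ia, ob, wa + wb, nxt)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return result


def best_output(fst, inputs):
    """Return (cost, outputs) of the cheapest accepting path, or None."""
    hyps = {fst.start: (0.0, ())}       # state -> best (cost, outputs) so far
    for sym in inputs:
        nxt = {}
        for state, (cost, outs) in hyps.items():
            for i, o, w, dst in fst.arcs.get(state, []):
                if i == sym:
                    cand = (cost + w, outs + (o,))
                    if dst not in nxt or cand[0] < nxt[dst][0]:
                        nxt[dst] = cand
        hyps = nxt
    done = [(c + fst.finals[s], outs) for s, (c, outs) in hyps.items()
            if s in fst.finals]
    return min(done) if done else None


# Knowledge source 1 (assumed): syllable-to-character lexicon, one state.
lexicon = WFST(0, {0: 0.0})
lexicon.add_arc(0, "ta", "他", 0.2, 0)
lexicon.add_arc(0, "shi", "是", 1.0, 0)
lexicon.add_arc(0, "shi", "十", 0.5, 0)   # cheaper in isolation

# Knowledge source 2 (assumed): character-sequence model that prefers 他 是.
char_model = WFST(0, {2: 0.0})
char_model.add_arc(0, "他", "他", 0.3, 1)
char_model.add_arc(1, "是", "是", 0.1, 2)
char_model.add_arc(1, "十", "十", 2.0, 2)

analyzer = compose(lexicon, char_model)   # the combined "analyzer"

lexicon_only = best_output(lexicon, ["ta", "shi"])
combined = best_output(analyzer, ["ta", "shi"])
print(lexicon_only[1], combined[1])
```

With these made-up weights, the lexicon alone picks 十 for the syllable "shi" because that arc is cheaper in isolation, while the composed analyzer picks 是 because the second knowledge source penalizes 十 after 他. Because both sources share the same transducer representation, adding a further level would only require one more call to `compose`.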