Paper: Fast decoding for open vocabulary spoken term detection

ACL ID N09-2070
Title Fast decoding for open vocabulary spoken term detection
Venue Human Language Technologies
Session Short Paper
Year 2009

Information retrieval and spoken-term detection from audio such as broadcast news, telephone conversations, conference calls, and meetings are of great interest to the academic, government, and business communities. Motivated by the requirement for high-quality indexes, this study explores the effect of using both word and sub-word information to find in-vocabulary and OOV query terms. It also explores the trade-off between search accuracy and the speed of audio transcription. We present a novel, vocabulary independent, hybrid LVCSR approach to audio indexing and search and show that using phonetic confusions derived from posterior probabilities estimated by a neural network in the retrieval of OOV queries can help in reducing misses. These methods are evaluated on data set...