Paper: Automatic Allocation of Training Data for Rapid Prototyping of Speech Understanding based on Multiple Model Combination

ACL ID C10-2066
Title Automatic Allocation of Training Data for Rapid Prototyping of Speech Understanding based on Multiple Model Combination
Venue International Conference on Computational Linguistics
Session Poster Session
Year 2010
Authors

The optimal choice of speech understanding method depends on the amount of training data available in rapid prototyping. A statistical method is ultimately chosen, but it is not clear at which point in the increase in training data a statistical method becomes effective. Our framework combines multiple automatic speech recognition (ASR) and language understanding (LU) modules to provide a set of speech understanding results and selects the best result among them. The issue is how to allocate training data to statistical modules and the selection module in order to avoid overfitting in training and obtain better performance. This paper presents an automatic training data allocation method that is based on the change in the coefficients of the logistic regression functions u...
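
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that each ASR/LU pair produces one candidate result with a small feature vector, and that the selection module scores each candidate with a logistic regression function specific to that pair, then keeps the highest-scoring one. The module names, features, weights, and the helper names `Candidate`, `logistic`, and `select_best` are all hypothetical and are not taken from the paper; the data allocation method itself is not shown.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Candidate:
    asr: str                    # hypothetical ASR module name
    lu: str                     # hypothetical LU module name
    result: str                 # speech understanding result from this ASR/LU pair
    features: Sequence[float]   # hypothetical features (e.g. confidence measures)


def logistic(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Logistic regression function: estimated probability that a result is correct."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Assumption: one (weights, bias) pair per ASR/LU combination.
Model = Tuple[Sequence[float], float]


def select_best(candidates: List[Candidate],
                models: Dict[Tuple[str, str], Model]) -> Candidate:
    """Return the candidate whose logistic score is highest."""
    return max(
        candidates,
        key=lambda c: logistic(*models[(c.asr, c.lu)], c.features),
    )


# Illustrative values only; none of these come from the paper.
candidates = [
    Candidate("grammar_asr", "fst_lu", "request_weather(city=Tokyo)", [0.9, 0.2]),
    Candidate("ngram_asr", "wfst_lu", "request_weather(city=Kyoto)", [0.4, 0.7]),
]
models = {
    ("grammar_asr", "fst_lu"): ([2.0, -1.0], 0.0),
    ("ngram_asr", "wfst_lu"): ([1.0, 1.0], -1.0),
}

best = select_best(candidates, models)
print(best.result)
```

In this sketch the coefficients are fixed by hand. In the framework described above they would be estimated from training data, which is why the amount of data allocated to the selection module matters.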