Paper: Semantic class induction and its application for a Chinese voice search system

ACL ID W10-4106
Title Semantic class induction and its application for a Chinese voice search system
Venue Joint Conference on Chinese Language Processing
Session Main Conference
Year 2010
Authors

In this paper, we propose a novel similarity measure based on co-occurrence probabilities for inducing semantic classes. Clustering with the new similarity measure outperforms clustering with the widely used distance measure based on Kullback-Leibler divergence in precision, recall, and F1. We then use the semantic classes and structures induced with the new similarity measure to generate in-domain data. Finally, we use the generated data for language model adaptation, improving character recognition accuracy from 85.2% to 91%.
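For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of one common form of a Kullback-Leibler-based distance between two words: the symmetric KL divergence between their smoothed right-context distributions. It illustrates the baseline only, not the co-occurrence similarity measure proposed in the paper, whose definition is not given in the abstract. The toy corpus, the function names, the restriction to right contexts, and the additive smoothing constant are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not from the paper): tokenized voice search queries.
sentences = [
    ["fly", "to", "beijing", "tomorrow"],
    ["fly", "to", "beijing", "today"],
    ["fly", "to", "shanghai", "tomorrow"],
    ["please", "call", "zhang", "now"],
]
vocab = sorted({tok for sent in sentences for tok in sent})


def context_distribution(word, sentences, vocab, smoothing=1e-3):
    """Smoothed distribution over the words immediately to the right of `word`."""
    counts = Counter()
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            if left == word:
                counts[right] += 1
    # Additive smoothing keeps every probability positive, so the KL terms are finite.
    total = sum(counts.values()) + smoothing * len(vocab)
    return {v: (counts[v] + smoothing) / total for v in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def symmetric_kl_distance(p, q):
    """Symmetrized KL divergence; smaller values mean more similar contexts."""
    return kl_divergence(p, q) + kl_divergence(q, p)


p_beijing = context_distribution("beijing", sentences, vocab)
p_shanghai = context_distribution("shanghai", sentences, vocab)
p_call = context_distribution("call", sentences, vocab)

d_city_city = symmetric_kl_distance(p_beijing, p_shanghai)
d_city_verb = symmetric_kl_distance(p_beijing, p_call)
```

In this toy setting the two city names share right contexts, so `d_city_city` comes out smaller than `d_city_verb`; a clustering procedure would therefore merge the city names into one candidate semantic class before merging either with "call".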