Paper: Learning New Semi-Supervised Deep Auto-encoder Features for Statistical Machine Translation

ACL ID P14-1012
Title Learning New Semi-Supervised Deep Auto-encoder Features for Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

In this paper, instead of designing new features based on intuition, linguistic knowledge and domain, we learn some new and effective features using the deep auto-encoder (DAE) paradigm for phrase-based translation model. Using the unsupervised pre-trained deep belief net (DBN) to initialize DAE's parameters and using the input original phrase features as a teacher for semi-supervised fine-tuning, we learn new semi-supervised DAE features, which are more effective and stable than the unsupervised DBN features. Moreover, to learn high dimensional feature representation, we introduce a natural horizontal composition of more DAEs for large hidden layers feature learning. On two Chinese-English tasks, our semi-supervised DAE features obtain statistically significant improveme...
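The horizontal composition mentioned in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading, namely that several small auto-encoders encode the same phrase feature vector and that their hidden-layer outputs are concatenated into one larger representation; the abstract does not spell out the exact construction. The class `TinyAutoEncoder`, the function `horizontal_compose`, the tied-weight decoder, the random initialization (standing in for DBN pre-training), and the feature values are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyAutoEncoder:
    """One-hidden-layer auto-encoder with tied weights (illustrative only).

    Weights are drawn at random here; in the paper's setting they would
    come from DBN pre-training followed by fine-tuning.
    """

    def __init__(self, n_in, n_hidden, rng):
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden   # encoder bias
        self.c = [0.0] * n_in       # decoder bias

    def encode(self, x):
        # Hidden code: sigmoid(W x + b)
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def decode(self, h):
        # Reconstruction: sigmoid(W^T h + c), reusing the encoder weights
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(len(h)))
                        + self.c[i])
                for i in range(len(self.c))]

    def reconstruction_error(self, x):
        # Squared error between the input and its reconstruction
        x_hat = self.decode(self.encode(x))
        return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def horizontal_compose(encoders, x):
    """Concatenate the hidden codes of several auto-encoders (assumed reading)."""
    features = []
    for enc in encoders:
        features.extend(enc.encode(x))
    return features


rng = random.Random(0)
phrase_features = [0.2, 0.5, 0.1, 0.7]  # made-up original phrase features
encoders = [TinyAutoEncoder(4, 3, rng), TinyAutoEncoder(4, 3, rng)]
new_features = horizontal_compose(encoders, phrase_features)
print(len(new_features))  # 6: two hidden layers of size 3, side by side
```

Under this assumption, the composed representation grows linearly with the number of auto-encoders while each individual model stays small, which is one way to obtain a large hidden layer without training a single wide network.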