Paper: A Bayesian Model for Learning SCFGs with Discontiguous Rules

ACL ID D12-1021
Title A Bayesian Model for Learning SCFGs with Discontiguous Rules
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

We describe a nonparametric model and corresponding inference algorithm for learning Synchronous Context Free Grammar derivations for parallel text. The model employs a Pitman-Yor Process prior which uses a novel base distribution over synchronous grammar rules. Through both synthetic grammar induction and statistical machine translation experiments, we show that our model learns complex translational correspondences, including discontiguous, many-to-many alignments, and produces competitive translation results. Further, inference is efficient and we present results on significantly larger corpora than prior work.
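
To illustrate what a Pitman-Yor Process prior over grammar rules looks like, the following is a minimal sketch of its Chinese-restaurant representation. It is not the paper's model or inference algorithm. The class name `PitmanYorRestaurant`, the parameters `discount` and `strength`, and the `base_prob` callback are all hypothetical names introduced here. The abstract's novel base distribution is not reproduced, so any function that returns a probability for a rule can stand in for it.

```python
import random
from collections import defaultdict


class PitmanYorRestaurant:
    """Chinese-restaurant view of a Pitman-Yor Process (illustrative only)."""

    def __init__(self, discount, strength, base_prob, seed=0):
        # Standard PYP constraints: 0 <= discount < 1 and strength > -discount.
        if not (0.0 <= discount < 1.0) or strength <= -discount:
            raise ValueError("need 0 <= discount < 1 and strength > -discount")
        self.d = discount
        self.theta = strength
        self.base_prob = base_prob       # assumed: callable rule -> probability
        self.tables = defaultdict(list)  # rule -> customer count per table
        self.n_customers = 0
        self.n_tables = 0
        self.rng = random.Random(seed)

    def prob(self, rule):
        """Predictive probability of `rule` given the current seating."""
        p0 = self.base_prob(rule)
        if self.n_customers == 0:
            return p0  # nothing observed yet: fall back to the base distribution
        counts = self.tables.get(rule, [])
        denom = self.n_customers + self.theta
        seated = (sum(counts) - self.d * len(counts)) / denom
        new_table = (self.theta + self.d * self.n_tables) / denom
        return seated + new_table * p0

    def add(self, rule):
        """Seat one observation of `rule`, sampling an old or a new table."""
        counts = self.tables[rule]
        new_weight = (self.theta + self.d * self.n_tables) * self.base_prob(rule)
        weights = [c - self.d for c in counts] + [new_weight]
        r = self.rng.random() * sum(weights)
        choice = len(counts)  # default to the new table
        for i, w in enumerate(weights):
            r -= w
            if r < 0:
                choice = i
                break
        if choice == len(counts):
            counts.append(1)
            self.n_tables += 1
        else:
            counts[choice] += 1
        self.n_customers += 1
```

In this sketch a frequently observed rule gains probability mass, while the discount reserves mass for rules not yet seen, which is the property that makes such priors attractive for grammar induction. A full sampler would also need to remove customers when resampling a derivation; that step is omitted here.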