Paper: A Provably Correct Learning Algorithm for Latent-Variable PCFGs

ACL ID P14-1099
Title A Provably Correct Learning Algorithm for Latent-Variable PCFGs
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2014
Authors

We introduce a provably correct learning algorithm for latent-variable PCFGs. The algorithm relies on two steps: first, the use of a matrix-decomposition algorithm ap- plied to a co-occurrence matrix estimated from the parse trees in a training sample; second, the use of EM applied to a convex objective derived from the training sam- ples in combination with the output from the matrix decomposition. Experiments on parsing and a language modeling problem show that the algorithm is efficient and ef- fective in practice.