Paper: Inducing Compact but Accurate Tree-Substitution Grammars

ACL ID N09-1062
Title Inducing Compact but Accurate Tree-Substitution Grammars
Venue Human Language Technologies
Session Main Conference
Year 2009
Authors

Tree substitution grammars (TSGs) are a compelling alternative to context-free grammars for modelling syntax. However, many popular techniques for estimating weighted TSGs (under the moniker of Data Oriented Parsing) suffer from the problems of inconsistency and over-fitting. We present a theoretically principled model which solves these problems using a Bayesian non-parametric formulation. Our model learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and in doing so far out-performs a standard PCFG.