Paper: First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests

ACL ID D09-1005
Title First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009
Authors

Many statistical translation models can be regarded as weighted logical deduction. Under this paradigm, we use weights from the expectation semiring (Eisner, 2002), to compute first-order statistics (e.g., the ex- pected hypothesis length or feature counts) over packed forests of translations (lat- tices or hypergraphs). We then introduce a novel second-order expectation semir- ing, which computes second-order statis- tics (e.g., the variance of the hypothe- sis length or the gradient of entropy). This second-order semiring is essential for many interesting training paradigms such as minimum risk, deterministic anneal- ing, active learning, and semi-supervised learning, where gradient descent optimiza- tion requires computing the gradient of en- tropy or risk. We use these semirings in an ...