Paper: Minimum Risk Annealing For Training Log-Linear Models

ACL ID P06-2101
Title Minimum Risk Annealing For Training Log-Linear Models
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006

When training the parameters for a natural language system, one would prefer to minimize 1-best loss (error) on an evaluation set. Since the error surface for many natural language problems is piecewise constant and riddled with local minima, many systems instead optimize log-likelihood, which is conveniently differentiable and convex. We propose training instead to minimize the expected loss, or risk. We define this expectation using a probability distribution over hypotheses that we gradually sharpen (anneal) to focus on the 1-best hypothesis. Besides the linear loss functions used in previous work, we also describe techniques for optimizing nonlinear functions such as precision or the BLEU metric. We present experiments training log-linear combinations of models for dependency par...
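The idea of an expected loss under a gradually sharpened distribution can be illustrated with a minimal sketch. The abstract does not give the exact form of the distribution, so the code below assumes one common choice: probabilities proportional to exp(gamma * score) over a finite hypothesis list, where gamma is a sharpening factor. The names `annealed_distribution`, `expected_loss`, and `gamma`, as well as the scores and losses, are hypothetical and chosen for illustration only.

```python
import math


def annealed_distribution(scores, gamma):
    """Return p(y) proportional to exp(gamma * score(y)) over a finite list.

    Assumption: this exponentiated, gamma-scaled form is one plausible way
    to "sharpen" a distribution; the abstract does not specify it.
    """
    scaled = [gamma * s for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(v - shift) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def expected_loss(scores, losses, gamma):
    """Risk: the loss of each hypothesis weighted by its probability."""
    probs = annealed_distribution(scores, gamma)
    return sum(p * loss for p, loss in zip(probs, losses))


# Hypothetical model scores and losses for three candidate hypotheses.
scores = [1.0, 0.5, -0.2]
losses = [0.3, 0.0, 1.0]

# Increasing gamma concentrates probability on the top-scoring hypothesis.
for gamma in (0.0, 1.0, 10.0, 100.0):
    print(gamma, expected_loss(scores, losses, gamma))
```

With gamma = 0 the distribution is uniform, so the risk is simply the mean loss over the list. As gamma grows, the risk approaches the loss of the highest-scoring hypothesis, which is the 1-best loss the abstract describes. For intermediate values of gamma the risk varies smoothly with the scores, unlike the piecewise-constant 1-best error.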