Paper: Grouping Language Model Boundary Words to Speed K–Best Extraction from Hypergraphs

ACL ID N13-1116
Title Grouping Language Model Boundary Words to Speed K–Best Extraction from Hypergraphs
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

We propose a new algorithm to approximately extract top-scoring hypotheses from a hyper- graph when the score includes an N?gram language model. In the popular cube prun- ing algorithm, every hypothesis is annotated with boundary words and permitted to recom- bine only if all boundary words are equal. However, many hypotheses share some, but not all, boundary words. We use these com- mon boundary words to group hypotheses and do so recursively, resulting in a tree of hy- potheses. This tree forms the basis for our new search algorithm that iteratively refines groups of boundary words on demand. Ma- chine translation experiments show our algo- rithm makes translation 1.50 to 3.51 times as fast as with cube pruning in common cases.