Paper: Lexicalized Markov Grammars for Sentence Compression

ACL ID N07-1023
Title Lexicalized Markov Grammars for Sentence Compression
Venue Human Language Technologies
Session Main Conference
Year 2007

We present a sentence compression system based on synchronous context-free grammars (SCFG), following the successful noisy-channel approach of (Knight and Marcu, 2000). We define a head-driven Markovization formulation of SCFG deletion rules, which allows us to lexicalize probabilities of constituent deletions. We also use a robust approach for tree-to-tree alignment between arbitrary document-abstract parallel corpora, which lets us train lexicalized models with much more data than previous approaches relying exclusively on scarcely available document-compression corpora. Finally, we evaluate different Markovized models, and find that our selected best model is one that exploits head-modifier bilexicalization to accurately distinguish adjuncts from complements, and that produces sente...
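
The head-driven Markovization named in the abstract can be illustrated with a small sketch. It is not the paper's actual parameterization. As an assumption for illustration, the probability of a deletion rule is factored into one keep/delete decision per modifier of the head, each conditioned on the parent label, the head word, the modifier's label and head word (the bilexical part), and the previous decision (a first-order Markov history). The names `Modifier` and `MarkovDeletionModel`, the add-alpha smoothing, and the single outward-from-head modifier sequence are all hypothetical simplifications.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Modifier:
    """A non-head child of a constituent (hypothetical representation)."""
    label: str      # syntactic category, e.g. "PP"
    head_word: str  # lexical head of the modifier


class MarkovDeletionModel:
    """First-order Markov model over per-modifier keep/delete decisions."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # add-alpha smoothing (an assumption of this sketch)
        # context -> [count kept, count deleted]
        self.counts = defaultdict(lambda: [0.0, 0.0])

    @staticmethod
    def _context(parent: str, head_word: str, mod: Modifier,
                 prev_deleted: Optional[bool]) -> Tuple:
        # Bilexical context: head word paired with the modifier's head word.
        return (parent, head_word, mod.label, mod.head_word, prev_deleted)

    def observe(self, parent: str, head_word: str,
                mods: Sequence[Modifier], deleted: Sequence[bool]) -> None:
        """Record one aligned constituent; mods are ordered outward from the head."""
        prev = None
        for mod, d in zip(mods, deleted):
            self.counts[self._context(parent, head_word, mod, prev)][int(d)] += 1
            prev = d

    def prob(self, parent: str, head_word: str,
             mods: Sequence[Modifier], deleted: Sequence[bool]) -> float:
        """Probability of a deletion pattern as a product of local decisions."""
        p = 1.0
        prev = None
        for mod, d in zip(mods, deleted):
            kept, dele = self.counts.get(
                self._context(parent, head_word, mod, prev), (0.0, 0.0))
            chosen = dele if d else kept
            p *= (chosen + self.alpha) / (kept + dele + 2 * self.alpha)
            prev = d
        return p


# Toy usage: a VP headed by "ate" with an object and a temporal adjunct.
mods = [Modifier("NP", "pizza"), Modifier("PP", "yesterday")]
model = MarkovDeletionModel()
for _ in range(3):
    model.observe("VP", "ate", mods, [False, True])   # adjunct deleted
model.observe("VP", "ate", mods, [False, False])      # nothing deleted
p_drop_adjunct = model.prob("VP", "ate", mods, [False, True])
p_drop_object = model.prob("VP", "ate", mods, [True, False])
```

On this toy data the model prefers deleting the adjunct over deleting the object, which is the kind of adjunct/complement distinction the abstract attributes to head-modifier bilexicalization. Because each decision is a locally normalized binary choice, the probabilities of all deletion patterns for a given constituent sum to one.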