Paper: Robust German Noun Chunking With A Probabilistic Context-Free Grammar

ACL ID C00-2105
Title Robust German Noun Chunking With A Probabilistic Context-Free Grammar
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2000
Authors

We present a noun chunker for German which is based on a head-lexicalised probabilistic context-free grammar. A manually developed grammar was semi-automatically extended with robustness rules in order to allow parsing of unrestricted text. The model parameters were learned from unlabelled training data by a probabilistic context-free parser. For extracting noun chunks, the parser generates all possible noun chunk analyses, scores them with a novel algorithm which maximizes the best chunk sequence criterion, and chooses the most probable chunk sequence. An evaluation of the chunker on 2,140 hand-annotated noun chunks yielded 92% recall and 93% precision.
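
The recall and precision figures above compare the chunker's output against hand-annotated chunks. As a minimal sketch of how such figures are commonly computed, the following assumes that chunks are represented as (start, end) token offsets and that a predicted chunk counts as correct only on an exact span match. The abstract does not state the matching criterion, so both the representation and the function name `chunk_precision_recall` are illustrative assumptions, and the toy spans are made up.

```python
def chunk_precision_recall(predicted, gold):
    """Return (precision, recall) for chunk spans.

    Assumption: a predicted chunk is correct only if its (start, end)
    span matches a gold chunk exactly. The abstract does not specify
    the matching criterion used in the evaluation.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Guard against empty inputs to avoid division by zero.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data with invented token offsets, not taken from the paper.
gold = {(0, 2), (3, 5), (6, 9), (10, 12)}
predicted = {(0, 2), (3, 5), (6, 8), (10, 12)}
precision, recall = chunk_precision_recall(predicted, gold)
```

In this toy case three of the four predicted spans match a gold span, and three of the four gold spans are found, so precision and recall are both 0.75.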