Paper: Using Parsed Corpora For Structural Disambiguation In The TRAINS Domain

ACL ID P96-1046
Title Using Parsed Corpora For Structural Disambiguation In The TRAINS Domain
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1996
Authors

This paper describes a prototype disambiguation module, KANKEI, which was tested on two corpora of the TRAINS project. In ambiguous verb phrases of form V ... NP PP or V ... NP adverb(s), the two corpora have very different PP and adverb attachment patterns; in the first, the correct attachment is to the VP 88.7% of the time, while in the second, the correct attachment is to the NP 73.5% of the time. KANKEI uses various n-gram patterns of the phrase heads around these ambiguities, and assigns parse trees (with these ambiguities) a score based on a linear combination of the frequencies with which these patterns appear with NP and VP attachments in the TRAINS corpora. Unlike previous statistical disambiguation systems, this technique thus combines evidence from bigrams, trigrams...
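The scoring idea in the abstract, a linear combination of attachment frequencies over several head n-gram patterns, can be illustrated with a minimal sketch. This is not KANKEI's actual implementation: the pattern set in `head_patterns`, the weights, the use of relative frequencies, and the toy training examples are all assumptions made for illustration, and none of them come from the paper.

```python
from collections import defaultdict


def head_patterns(verb, noun, prep, obj):
    """Return hypothetical n-gram patterns over the phrase heads.

    The pattern inventory here is an assumption; the abstract only says
    that various n-gram patterns of phrase heads are used.
    """
    return {
        "prep": (prep,),
        "verb_prep": (verb, prep),
        "noun_prep": (noun, prep),
        "verb_noun_prep": (verb, noun, prep),
    }


class AttachmentScorer:
    """Scores NP vs. VP attachment by a weighted sum of pattern frequencies."""

    def __init__(self, weights):
        # weights: pattern name -> coefficient in the linear combination
        self.weights = weights
        self.counts = defaultdict(lambda: {"NP": 0, "VP": 0})

    def observe(self, patterns, attachment):
        """Record one parsed training example with its correct attachment."""
        for name, key in patterns.items():
            self.counts[(name, key)][attachment] += 1

    def score(self, patterns, attachment):
        """Weighted sum of relative frequencies of `attachment` per pattern."""
        total = 0.0
        for name, key in patterns.items():
            seen = self.counts.get((name, key))
            if seen is None:
                continue  # unseen pattern contributes no evidence
            n = seen["NP"] + seen["VP"]
            total += self.weights.get(name, 0.0) * seen[attachment] / n
        return total

    def choose(self, patterns):
        """Pick the higher-scoring attachment; ties fall to NP (arbitrary)."""
        return max(("NP", "VP"), key=lambda a: self.score(patterns, a))


# Illustrative weights and invented training examples (not corpus data).
weights = {"prep": 0.2, "verb_prep": 0.3, "noun_prep": 0.3, "verb_noun_prep": 0.2}
scorer = AttachmentScorer(weights)
scorer.observe(head_patterns("load", "oranges", "into", "boxcar"), "VP")
scorer.observe(head_patterns("move", "engine", "to", "Avon"), "VP")
scorer.observe(head_patterns("take", "boxcar", "at", "Corning"), "NP")

query = head_patterns("load", "bananas", "into", "tanker")
decision = scorer.choose(query)
```

In this sketch, a longer pattern that was never observed simply contributes nothing, so shorter patterns still supply evidence. That is one simple way to combine bigram and trigram evidence in a single score; the weighting scheme the paper actually uses may differ.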