Paper: Automatic Acquisition Of Subcategorization Frames From Untagged Text

ACL ID P91-1027
Title Automatic Acquisition Of Subcategorization Frames From Untagged Text
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1991
Authors

This paper describes an implemented program that takes a raw, untagged text corpus as its only input (no open-class dictionary) and generates a partial list of verbs occurring in the text and the subcategorization frames (SFs) in which they occur. Verbs are detected by a novel technique based on the Case Filter of Rouvret and Vergnaud (1980). The completeness of the output list increases monotonically with the total number of occurrences of each verb in the corpus. False positive rates are one to three percent of observations. Five SFs are currently detected and more are planned. Ultimately, I expect to provide a large SF dictionary to the NLP community and to train dictionaries for specific corpora.