Paper: How Verb Subcategorization Frequencies are Affected by Corpus Choice

ACL ID P98-2184
Title How Verb Subcategorization Frequencies are Affected by Corpus Choice
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998
Authors

The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora, while psycholinguists compute them from psychological studies (sentence production and completion tasks). Recent studies have found differences between corpus frequencies and psycholinguistic measures. We analyze subcategorization frequencies from four different corpora: psychological sentence production data (Connine et al. 1984), written text (Brown and WSJ), and telephone conversation data (Switchboard). We find two different ...