Paper: Automatic Grammar Acquisition

ACL ID H94-1051
Title Automatic Grammar Acquisition
Venue Human Language Technologies
Session Main Conference
Year 1994

We describe a series of three experiments in which supervised learning techniques were used to acquire three different types of grammars for English news stories. The acquired grammar types were: 1) context-free, 2) context-dependent, and 3) probabilistic context-free. Training data were derived from University of Pennsylvania Treebank parses of 50 Wall Street Journal articles. In each case, the system started with essentially no grammatical knowledge and learned a set of grammar rules exclusively from the training data. Performance for each grammar type was then evaluated on an independent set of test sentences using Parseval, a standard measure of parsing accuracy. These experimental results yield a direct quantitative comparison of the three methods.
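
To make the evaluation measure concrete, the following is a minimal sketch of Parseval-style bracket scoring. It is not the scoring code used in these experiments, whose details the abstract does not give. It assumes unlabeled brackets, trees written as nested tuples of the form (label, child, ...) with plain strings as words, and hypothetical helper names (spans, crossing, parseval) chosen only for illustration.

```python
def spans(tree, start=0, out=None):
    """Return (end_position, set of (start, end) word spans) for a tree.

    A tree is either a word (str) or a tuple (label, child, child, ...).
    Labels are ignored, so this is an unlabeled-bracket simplification.
    """
    if out is None:
        out = set()
    if isinstance(tree, str):
        # A word occupies exactly one position and adds no bracket.
        return start + 1, out
    pos = start
    for child in tree[1:]:
        pos, _ = spans(child, pos, out)
    out.add((start, pos))
    return pos, out


def crossing(a, b):
    """True if spans a and b overlap without one containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def parseval(gold, test):
    """Return (precision, recall, crossing_brackets) for one sentence."""
    _, g = spans(gold)
    _, t = spans(test)
    matched = len(g & t)
    precision = matched / len(t) if t else 0.0
    recall = matched / len(g) if g else 0.0
    # Count test brackets that cross at least one gold bracket.
    crossings = sum(1 for s in t if any(crossing(s, r) for r in g))
    return precision, recall, crossings


# Hypothetical example sentence: "the dog chased the cat".
gold = ("S", ("NP", "the", "dog"), ("VP", "chased", ("NP", "the", "cat")))
test = ("S", ("NP", "the", "dog"), ("VP", ("X", "chased", "the"), "cat"))
print(parseval(gold, test))
```

Here the test parse shares three of its four brackets with the reference parse, and its bracket around "chased the" crosses the reference bracket around "the cat". Because spans are stored in a set, unary chains over the same words collapse into one bracket, which is another simplification of this sketch.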