Paper: Statistical Filtering And Subcategorization Frame Acquisition

ACL ID W00-1325
Title Statistical Filtering And Subcategorization Frame Acquisition
Venue 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
Session Main Conference
Year 2000

Research into the automatic acquisition of subcategorization frames (SCFs) from corpora is starting to produce large-scale computational lexicons which include valuable frequency information. However, the accuracy of the resulting lexicons shows room for improvement. One significant source of error lies in the statistical filtering used by some researchers to remove noise from automatically acquired subcategorization frames. In this paper, we compare three different approaches to filtering out spurious hypotheses. Two hypothesis tests perform poorly, compared to filtering frames on the basis of relative frequency. We discuss reasons for this and consider directions for future research.
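
As a rough illustration of what filtering frames on the basis of relative frequency can look like, the following Python sketch keeps a hypothesized frame for a verb only if its share of that verb's observations reaches a cutoff. The function name, the frame labels, the counts, and the 0.05 threshold are all hypothetical choices made for this example; the abstract does not specify the threshold or the exact procedure used in the paper.

```python
from collections import Counter


def filter_by_relative_frequency(frame_counts, threshold=0.05):
    """Return {frame: relative frequency} for frames at or above the threshold.

    frame_counts maps a hypothesized frame label to the number of times it
    was observed with one verb. The threshold is an assumed value, not one
    taken from the paper.
    """
    total = sum(frame_counts.values())
    if total == 0:
        return {}
    return {
        frame: count / total
        for frame, count in frame_counts.items()
        if count / total >= threshold
    }


# Hypothetical counts of hypothesized frames for a single verb.
counts = Counter({"NP": 60, "NP_PP": 30, "SCOMP": 8, "NP_NP": 2})
kept = filter_by_relative_frequency(counts, threshold=0.05)
```

With these made-up counts, the rare frame (2 of 100 observations) falls below the cutoff and is discarded, while the other three are retained. Unlike a hypothesis test, this approach needs no estimate of an error probability per frame, only the observed counts.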