Paper: Toward General-Purpose Learning for Information Extraction

ACL ID P98-1067
Title Toward General-Purpose Learning for Information Extraction
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998

Two trends are evident in the recent evolution of the field of information extraction: a preference for simple, often corpus-driven techniques over linguistically sophisticated ones; and a broadening of the central problem definition to include many non-traditional text domains. This development calls for information extraction systems which are as retargetable and general as possible. Here, we describe SRV, a learning architecture for information extraction which is designed for maximum generality and flexibility. SRV can exploit domain-specific information, including linguistic syntax and lexical information, in the form of features provided to the system explicitly as input for training. This process is illustrated using a domain created from Reuters corporate acquisiti...