Paper: Toward General-Purpose Learning for Information Extraction

ACL ID C98-1064
Title Toward General-Purpose Learning for Information Extraction
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998

Two trends are evident in the recent evolution of the field of information extraction: a preference for simple, often corpus-driven techniques over linguistically sophisticated ones; and a broadening of the central problem definition to include many non-traditional text domains. This development calls for information extraction systems which are as retargetable and general as possible. Here, we describe SRV, a learning architecture for information extraction which is designed for maximum generality and flexibility. SRV can exploit domain-specific information, including linguistic syntax and lexical information, in the form of features provided to the system explicitly as input for training. This process is illustrated using a domain created from Reuters c...