Paper: Segment-Based Hidden Markov Models For Information Extraction

ACL ID P06-1061
Title Segment-Based Hidden Markov Models For Information Extraction
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2006
Authors

Hidden Markov models (HMMs) are pow- erful statistical models that have found successful applications in Information Ex- traction (IE). In current approaches to ap- plying HMMs to IE, an HMM is used to model text at the document level. This modelling might cause undesired redun- dancy in extraction in the sense that more than one filler is identified and extracted. We propose to use HMMs to model text at the segment level, in which the extrac- tion process consists of two steps: a seg- ment retrieval step followed by an extrac- tion step. In order to retrieve extraction- relevant segments from documents, we in- troduce a method to use HMMs to model and retrieve segments. Our experimen- tal results show that the resulting segment HMM IE system not only achieves near zero extraction redundan...