Paper: Unsupervised Information Extraction with Distributional Prior Knowledge

ACL ID D11-1075
Title Unsupervised Information Extraction with Distributional Prior Knowledge
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011
Authors

We address the task of automatically discovering information extraction templates from a given text collection. Our approach clusters candidate slot fillers to identify meaningful template slots. We propose a generative model that incorporates distributional prior knowledge to help distribute the candidates in a document into appropriate slots. Empirical results suggest that the proposed prior can bring substantial improvements to our task compared with a K-means baseline and a Gaussian mixture model baseline. Specifically, the proposed prior has been shown to be effective when coupled with discriminative features of the candidates.
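
To make the clustering setup concrete, here is a minimal sketch of a K-means-style baseline of the kind the abstract compares against. It groups candidate slot fillers by feature vector so that each cluster can be read as a template slot. The `kmeans` helper, the candidate strings, and the two-dimensional feature values are all hypothetical and chosen only for illustration; the sketch does not implement the proposed generative model or its distributional prior.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on lists of floats; returns one cluster id per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: mean of the members; keep the old centroid if a cluster is empty.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


# Hypothetical candidate slot fillers with made-up 2-d feature vectors
# (assumption: the paper's actual features are not given in the abstract).
candidates = {
    "Monday": [1.0, 0.0],
    "Friday": [0.9, 0.1],
    "Boston": [0.0, 1.0],
    "Chicago": [0.1, 0.9],
}
names = list(candidates)
labels = kmeans([candidates[n] for n in names], k=2)
slots = dict(zip(names, labels))  # candidate -> induced slot id
```

In this toy setting the two date-like candidates end up in one cluster and the two location-like candidates in the other. The cluster ids themselves are arbitrary, so only the grouping is meaningful.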