Paper: Distant supervision for relation extraction without labeled data

ACL ID P09-1113
Title Distant supervision for relation extraction without labeled data
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2009
Authors

Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting...
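The labeling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the knowledge base `KB`, the sentences in `SENTENCES`, the helper names `between_feature` and `build_training_data`, and the single "words between the two entities" feature are all assumptions chosen for illustration. Entity mentions are assumed to be tagged already, and classifier training is left out.

```python
from collections import Counter, defaultdict

# Hypothetical toy knowledge base standing in for Freebase:
# (entity1, entity2) -> relation name.
KB = {
    ("Steven Spielberg", "Saving Private Ryan"): "film/director",
    ("Edwin Hubble", "Marshfield"): "person/place_of_birth",
}

# Hypothetical unlabeled corpus. Each item is (sentence text, entity mentions);
# the mentions are assumed to come from an upstream entity tagger.
SENTENCES = [
    ("Steven Spielberg's film Saving Private Ryan is based on a true story.",
     ["Steven Spielberg", "Saving Private Ryan"]),
    ("Saving Private Ryan was directed by Steven Spielberg.",
     ["Saving Private Ryan", "Steven Spielberg"]),
    ("Edwin Hubble was born in Marshfield, Missouri.",
     ["Edwin Hubble", "Marshfield", "Missouri"]),
]


def between_feature(sentence, e1, e2):
    """Return a simple lexical feature: the words between the two entities,
    prefixed with which entity came first. Returns None if either is absent."""
    i, j = sentence.find(e1), sentence.find(e2)
    if i < 0 or j < 0:
        return None
    if i < j:
        middle, order = sentence[i + len(e1):j], "e1_first"
    else:
        middle, order = sentence[j + len(e2):i], "e2_first"
    return f"{order}|{' '.join(middle.split())}"


def build_training_data(kb, sentences):
    """For every KB entity pair, collect features from all sentences that
    mention both entities, and label them with the KB relation."""
    features = defaultdict(Counter)
    for text, mentions in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in mentions and e2 in mentions:
                feat = between_feature(text, e1, e2)
                if feat is not None:
                    features[(e1, e2, relation)][feat] += 1
    return dict(features)


training_data = build_training_data(KB, SENTENCES)
for key, counts in training_data.items():
    print(key, dict(counts))
```

Features from every sentence that mentions the same entity pair are pooled into one training example, so a noisy match in a single sentence is one signal among several. The resulting feature counts could then be passed to any multiclass classifier.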