Paper: Seeded Discovery of Base Relations in Large Corpora

ACL ID D08-1062
Title Seeded Discovery of Base Relations in Large Corpora
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008

Relationship discovery is the task of identifying salient relationships between named entities in text. We propose novel approaches for two sub-tasks of the problem: identifying the entities of interest, and partitioning and describing the relations based on their semantics. In particular, we show that term frequency patterns can be used effectively instead of supervised NER, and that the p-median clustering objective function naturally uncovers relation exemplars appropriate for describing the partitioning. Furthermore, we introduce a novel application of relationship discovery: the unsupervised identification of protein-protein interaction phrases.
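
To illustrate the p-median objective named in the abstract, the following is a minimal sketch of its standard formulation: choose p of the data points as medians (exemplars) so that the summed distance from every point to its nearest median is as small as possible. This is not the paper's implementation. The function names, the toy one-dimensional points, and the exhaustive search are all assumptions made for illustration, and exhaustive search is only feasible for very small inputs.

```python
from itertools import combinations


def p_median_cost(dist, medians):
    """Sum, over all points, of the distance to the nearest chosen median."""
    return sum(min(row[m] for m in medians) for row in dist)


def brute_force_p_median(dist, p):
    """Try every size-p subset of points as medians and keep the cheapest.

    Illustrative only: the number of subsets grows combinatorially.
    """
    n = len(dist)
    best = min(combinations(range(n), p),
               key=lambda ms: p_median_cost(dist, ms))
    # Each point is assigned to its nearest median, which acts as the
    # exemplar describing that cluster.
    assignment = [min(best, key=lambda m: dist[i][m]) for i in range(n)]
    return list(best), assignment


# Hypothetical toy data: six points on a line forming two obvious groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]

exemplars, assignment = brute_force_p_median(dist, p=2)
print(exemplars)   # indices of the points chosen as exemplars
print(assignment)  # exemplar index assigned to each point
```

Because the medians are themselves data points, each cluster comes with a concrete representative member, which is why this objective lends itself to describing a partition by example.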