Paper: Discovering Relations Between Named Entities from a Large Raw Corpus Using Tree Similarity-Based Clustering

ACL ID I05-1034
Title Discovering Relations Between Named Entities from a Large Raw Corpus Using Tree Similarity-Based Clustering
Venue International Joint Conference on Natural Language Processing
Session Main Conference
Year 2005
Authors

We propose a tree-similarity-based unsupervised learning method for extracting relations between Named Entities from a large raw corpus. Our method treats relation extraction as a clustering problem over shallow parse trees. First, we modify previous tree kernels for relation extraction to estimate the similarity between parse trees more efficiently. Then, this similarity is used in a hierarchical clustering algorithm to group entity pairs into clusters. Finally, each cluster is labeled with an indicative word, and unreliable clusters are pruned. Evaluation on the New York Times (1995) corpus shows that our method outperforms the only previous work by 5 in F-measure. It also shows that our method performs well on both high-frequency and less-frequent entit...
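
The three stages named in the abstract (pairwise similarity, hierarchical clustering, then labeling and pruning) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `jaccard` function is a simple token-overlap stand-in and is not the paper's tree kernel, average linkage and the `threshold` and `min_size` values are arbitrary choices, and the toy contexts are invented rather than taken from the New York Times corpus.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity on token lists (NOT the paper's tree kernel)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def average_link(c1, c2, sim):
    """Mean pairwise similarity between the members of two clusters."""
    return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def cluster(items, similarity, threshold):
    """Agglomerative clustering: merge the closest pair until none reaches threshold."""
    n = len(items)
    sim = [[similarity(items[i], items[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        (a, b), best = max(
            (((x, y), average_link(clusters[x], clusters[y], sim))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda pair_score: pair_score[1],
        )
        if best < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


def prune(clusters, min_size):
    """Drop clusters too small to be considered reliable (assumed criterion)."""
    return [c for c in clusters if len(c) >= min_size]


def label(cluster_ids, items):
    """Label a cluster with the token shared by the most members."""
    counts = Counter(tok for i in cluster_ids for tok in dict.fromkeys(items[i]))
    return counts.most_common(1)[0][0]


# Hypothetical contexts, one per entity pair (illustrative data only).
contexts = [
    ["acquire", "company"],
    ["acquire", "stake"],
    ["president", "of"],
    ["president", "former"],
    ["visit"],
]

clusters = cluster(contexts, jaccard, threshold=0.3)
kept = prune(clusters, min_size=2)
labels = [label(c, contexts) for c in kept]
print(kept, labels)
```

In this sketch the similarity function is passed in as a parameter, so a real tree-based similarity over shallow parse trees could replace `jaccard` without changing the clustering, pruning, or labeling steps.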