Paper: Inducing Gazetteers for Named Entity Recognition by Large-Scale Clustering of Dependency Relations

ACL ID P08-1047
Title Inducing Gazetteers for Named Entity Recognition by Large-Scale Clustering of Dependency Relations
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008
Authors
  • Jun'ichi Kazama (Japan Advanced Institute of Science and Technology, Nomi Japan)
  • Kentaro Torisawa (National Institute of Information and Communications Technology, Kyoto Japan)

We propose using large-scale clustering of dependency relations between verbs and multiword nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependency relations capture the semantics of MNs well, the MN clusters constructed by using dependency relations should serve as a good gazetteer. However, the high level of computational cost has prevented the use of clustering for constructing gazetteers. We parallelized a clustering algorithm based on expectation-maximization (EM) and thus enabled the construction of large-scale MN clusters. We demonstrated with the IREX dataset for the Japanese NER that using the constructed clusters as a gazetteer (cluster gazetteer) is an effective way of improving the accuracy of NER. Moreover, we demonstrate that the combinati...