Paper: Who is Who and What is What: Experiments in Cross-Document Co-Reference

ACL ID D08-1029
Title Who is Who and What is What: Experiments in Cross-Document Co-Reference
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

This paper describes a language-independent, scalable system for both challenges of cross-document co-reference: name variation and entity disambiguation. We provide system results from the ACE 2008 evaluation in both English and Arabic. Our English system’s accuracy is 8.4% relative better than an exact match baseline (and 14.2% relative better over entities mentioned in more than one document). Unlike previous evaluations, ACE 2008 evaluated both name variation and entity disambiguation over naturally occurring named mentions. An information extraction engine finds document entities in text. We describe how our architecture designed for the 10K document ACE task is scalable to an even larger corpus. Our cross-document approach uses the names of entities to find...
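
For orientation, the exact match baseline mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the baseline in detail, so the input format (tuples of document ID, entity ID, and name) and the function name `exact_match_baseline` are assumptions made here. The sketch simply links document-level entities whose name strings are identical.

```python
from collections import defaultdict


def exact_match_baseline(doc_entities):
    """Cluster document-level entities whose name strings are identical.

    doc_entities: iterable of (doc_id, entity_id, name) tuples.
    This input format is hypothetical; the paper's abstract does not specify one.
    Returns a list of clusters, each a sorted list of (doc_id, entity_id) pairs,
    ordered by the shared name string.
    """
    clusters = defaultdict(list)
    for doc_id, entity_id, name in doc_entities:
        # Entities are linked only when their names match character for character.
        clusters[name].append((doc_id, entity_id))
    return [sorted(members) for _, members in sorted(clusters.items())]


# Illustrative input only; these entities are made up for the example.
entities = [
    ("doc1", "e1", "John Smith"),
    ("doc2", "e1", "John Smith"),
    ("doc3", "e4", "J. Smith"),
]
result = exact_match_baseline(entities)
```

Such a baseline runs into both challenges the abstract names: it cannot link "J. Smith" to "John Smith" (name variation), and it merges every "John Smith" even when the documents refer to different people (entity disambiguation).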