Paper: Combining Lexical And Formatting Cues For Named Entity Acquisition From The Web

ACL ID W00-1323
Title Combining Lexical And Formatting Cues For Named Entity Acquisition From The Web
Venue 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
Session Main Conference
Year 2000
Authors

Because of their constant renewal, it is necessary to acquire fresh named entities (NEs) from recent text sources. We present a tool for the acquisition and the typing of NEs from the Web that associates a harvester and three parallel shallow parsers dedicated to specific structures (lists, enumerations, and anchors). The parsers combine lexical indices such as discourse markers with formatting instructions (HTML tags) for analyzing enumerations and associated initializers.
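
As a rough illustration of how a lexical cue and a formatting cue might be combined, the sketch below pairs an HTML list (the formatting cue) with the sentence that introduces it (the initializer) and uses a discourse marker in that sentence to type the list items. It is not the authors' implementation: the names `ListParser`, `type_items`, `MARKER`, and `BLOCK_TAGS`, the single marker "following", and the sample HTML are all assumptions made for the example, and nested lists are ignored.

```python
import re
from html.parser import HTMLParser

# Hypothetical discourse marker: "... the following <type>:" at the end of the
# initializer. The paper's actual marker inventory is not reproduced here.
MARKER = re.compile(r"\bfollowing\s+(?P<type>[A-Za-z]+)\s*:\s*$")

# Block-level tags that start a fresh candidate initializer (an assumption).
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6"}


class ListParser(HTMLParser):
    """Collect flat <ul>/<ol> lists together with the text introducing them."""

    def __init__(self):
        super().__init__()
        self.pending_text = []  # text of the block preceding a list
        self.initializer = ""   # text that introduced the current list
        self.items = []         # items of the current list
        self.item = None        # text chunks of the <li> being read
        self.in_list = False
        self.results = []       # (initializer, [items]) pairs

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol") and not self.in_list:
            self.initializer = " ".join("".join(self.pending_text).split())
            self.items = []
            self.in_list = True
        elif tag == "li" and self.in_list:
            self._close_item()
            self.item = []
        elif tag in BLOCK_TAGS and not self.in_list:
            self.pending_text = []

    def handle_endtag(self, tag):
        if tag == "li":
            self._close_item()
        elif tag in ("ul", "ol") and self.in_list:
            self._close_item()
            self.results.append((self.initializer, self.items))
            self.in_list = False
            self.pending_text = []

    def handle_data(self, data):
        if self.item is not None:
            self.item.append(data)
        elif not self.in_list:
            self.pending_text.append(data)

    def _close_item(self):
        if self.item is not None:
            text = " ".join("".join(self.item).split())
            if text:
                self.items.append(text)
            self.item = None


def type_items(html):
    """Return (item, type) pairs for lists whose initializer contains the marker."""
    parser = ListParser()
    parser.feed(html)
    parser.close()
    typed = []
    for initializer, items in parser.results:
        match = MARKER.search(initializer)
        if match:
            label = match.group("type").lower()
            typed.extend((item, label) for item in items)
    return typed


if __name__ == "__main__":
    sample = (
        "<p>Our partners include the following universities:</p>\n"
        "<ul><li>Stanford University</li><li>University of Nantes</li></ul>"
    )
    for item, label in type_items(sample):
        print(f"{item}\t{label}")
```

Lists whose introducing text lacks the marker are skipped rather than guessed at, which favors precision over recall in this simplified setting.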