Paper: Demonstration Of The CROSSMARC System

ACL ID N03-4007
Title Demonstration Of The CROSSMARC System
Venue Human Language Technologies
Session System Demonstration
Year 2003

Proceedings of HLT-NAACL 2003, Demonstrations, pp. 13-14, Edmonton, May-June 2003

Figure 1: Architecture of the CROSSMARC system

As shown in Figure 1, the CROSSMARC multi-agent architecture includes agents for web page collection (crawling agent, spidering agent), information extraction, data storage and data presentation. These agents communicate through the blackboard. The Crawling Agent defines a schedule for invoking the focused crawler; this schedule is written to the blackboard and can be refined by the human administrator. The Spidering Agent is an autonomous software component that retrieves sites to spider from the blackboard and locates interesting web pages within them by traversing their links. Again, status information is written to the blackboard. The multi-lingual IE system is a dist...
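
The blackboard-mediated coordination between the Crawling Agent and the Spidering Agent described above can be illustrated with a minimal Python sketch. It is not the CROSSMARC implementation: the `Blackboard` class, the `crawling_agent` and `spidering_agent` functions, the key names, the "daily" schedule value and the toy link graph are all hypothetical stand-ins, and the focused crawler is stubbed out as a fixed list of seed sites.

```python
from collections import deque


class Blackboard:
    """Hypothetical shared store; agents post and read entries instead of calling each other."""

    def __init__(self):
        self._entries = {}

    def post(self, key, value):
        self._entries.setdefault(key, []).append(value)

    def read(self, key):
        return list(self._entries.get(key, []))


def crawling_agent(board, seed_sites, schedule="daily"):
    """Posts a crawl schedule and the sites a (stubbed) focused crawler would return."""
    board.post("schedule", schedule)  # assumed value; an administrator could refine it
    for site in seed_sites:
        board.post("sites", site)
    board.post("status", "crawler: posted %d sites" % len(seed_sites))


def spidering_agent(board, link_graph, is_interesting):
    """Reads sites from the blackboard and traverses their links breadth-first."""
    for site in board.read("sites"):
        seen, queue = {site}, deque([site])
        while queue:
            page = queue.popleft()
            if is_interesting(page):
                board.post("pages", page)
            for linked in link_graph.get(page, []):
                if linked not in seen:
                    seen.add(linked)
                    queue.append(linked)
        board.post("status", "spider: finished %s" % site)


def run_demo():
    # Toy in-memory link graph standing in for real web sites (assumption).
    link_graph = {
        "shop.example": ["shop.example/products", "shop.example/about"],
        "shop.example/products": ["shop.example/products/item-a"],
    }
    board = Blackboard()
    crawling_agent(board, ["shop.example"])
    spidering_agent(board, link_graph, lambda url: "products" in url)
    return board
```

Calling `run_demo()` returns a blackboard whose "pages" entries hold the pages the spider judged interesting and whose "status" entries record what each agent did. Because the two agents never reference each other, either could be replaced or rescheduled without changing the other, which is the main appeal of a blackboard design.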