Paper: Story tracking: linking similar news over time and across languages

ACL ID W08-1408
Title Story tracking: linking similar news over time and across languages
Venue Coling 2008: Proceedings of the workshop Multi-source Multilingual Information Extraction and Summarization
Session
Year 2008
Authors

The Europe Media Monitor system (EMM) gathers and aggregates an average of 50,000 newspaper articles per day in over 40 languages. To manage the information overflow, it was decided to group similar articles per day and per language into clusters and to link daily clusters over time into stories. A story automatically comes into existence when related groups of articles occur within a 7-day window. While cross-lingual links across 19 languages for individual news clusters have been displayed since 2004 as part of a freely accessible online application (http://press.jrc.it/NewsExplorer), the newest development is work on linking entire stories across languages. The evaluation of the monolingual aggregation of historical clusters into stories and of the linking of st...
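
The monolingual step described in the abstract (linking daily clusters into stories when related clusters fall within a 7-day window) might be sketched roughly as follows. This is only an illustration, not the EMM implementation: the abstract states the 7-day window, but the bag-of-words cluster representation, the cosine similarity measure, the 0.5 threshold, and the names `cosine` and `link_clusters` are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta
from math import sqrt

WINDOW = timedelta(days=7)  # 7-day window, as stated in the abstract
THRESHOLD = 0.5             # hypothetical similarity cut-off, not from the paper


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_clusters(clusters):
    """Link daily clusters into stories.

    clusters: list of (day, Counter of terms), sorted by day.
    Returns a list of stories, each a list of cluster indices.
    A cluster that is never linked to another one belongs to no story.
    """
    story_of = {}  # cluster index -> story index
    stories = []
    for i, (day_i, terms_i) in enumerate(clusters):
        best_j, best_sim = None, THRESHOLD
        for j in range(i):
            day_j, terms_j = clusters[j]
            # Only compare with clusters from earlier days inside the window.
            if timedelta(0) < day_i - day_j <= WINDOW:
                sim = cosine(terms_i, terms_j)
                if sim >= best_sim:
                    best_j, best_sim = j, sim
        if best_j is None:
            continue  # no related earlier cluster within the window
        if best_j not in story_of:
            # The story comes into existence with its second related cluster.
            story_of[best_j] = len(stories)
            stories.append([best_j])
        sid = story_of[best_j]
        stories[sid].append(i)
        story_of[i] = sid
    return stories


example = [
    (date(2008, 1, 1), Counter("earthquake peru rescue".split())),
    (date(2008, 1, 2), Counter("election vote senate".split())),
    (date(2008, 1, 3), Counter("earthquake peru victims".split())),
    (date(2008, 1, 20), Counter("earthquake peru rescue".split())),
]
print(link_clusters(example))
```

In this toy input, the two earthquake clusters from 1 and 3 January are linked into one story, while the similar cluster from 20 January is left out because it falls outside the 7-day window.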