Paper: Cross-Lingual Information to the Rescue in Keyword Extraction

ACL ID P14-5001
Title Cross-Lingual Information to the Rescue in Keyword Extraction
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

We introduce a method that extracts keywords in one language with the help of another. Our approach bridges and fuses word statistics across languages that are conventionally treated as unrelated. The method involves estimating keyword preferences with respect to domain topics and generating cross-lingual bridges for integrating word statistics. At run-time, we transform parallel articles into word graphs, build cross-lingual edges, and apply PageRank with word keyness information to extract keywords. We present a system, BiKEA, that applies the method to keyword analysis. Experiments show that keyword extraction benefits from PageRank, globally learned keyword preferences, and cross-lingual interaction of word statistics that respects language diversity.
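
The run-time steps named in the abstract (word graphs, cross-lingual edges, PageRank with keyness information) can be illustrated with a minimal sketch. This is not BiKEA's implementation: the co-occurrence window, the edge weights, the use of keyness as the random-jump prior, the language prefixes on tokens, and every function name and data value below are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_graph(sentences, window=2):
    """Undirected co-occurrence graph: words within `window` positions are linked."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                if v != w:
                    graph[w][v] += 1.0
                    graph[v][w] += 1.0
    return graph


def add_bridges(graph, bridges, weight=1.0):
    """Add cross-lingual edges between aligned word pairs, in both directions."""
    for src, tgt in bridges:
        graph[src][tgt] += weight
        graph[tgt][src] += weight


def pagerank(graph, keyness, damping=0.85, iterations=50):
    """Power iteration; random jumps follow the normalized keyness prior.

    Words missing from `keyness` get a default weight of 1.0 (an assumption).
    """
    nodes = sorted(graph)
    total = sum(keyness.get(n, 1.0) for n in nodes)
    prior = {n: keyness.get(n, 1.0) / total for n in nodes}
    score = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            out = sum(graph[n].values())
            for m, w in graph[n].items():
                nxt[m] += damping * score[n] * w / out
        score = nxt
    return score


# Hypothetical toy data: one tokenized sentence per language, tokens
# prefixed with a language tag so the two vocabularies never collide.
english = [["en:keyword", "en:extraction", "en:method"]]
german = [["de:schlagwort", "de:extraktion", "de:methode"]]

graph = build_graph(english + german)
add_bridges(graph, [("en:keyword", "de:schlagwort")])  # assumed word alignment
scores = pagerank(graph, keyness={"en:keyword": 2.0})  # assumed keyness value

for word in sorted(scores, key=scores.get, reverse=True):
    print(f"{word}\t{scores[word]:.4f}")
```

In this sketch the bridge edges are what let score mass flow between the two monolingual graphs; without them, each language would be ranked in isolation.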