Paper: Relative Rank Statistics for Dialog Analysis

ACL ID D08-1101
Title Relative Rank Statistics for Dialog Analysis
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
  • Juan Huerta (IBM T.J. Watson Research Center, Yorktown Heights NY)

We introduce the relative rank differential statistic which is a non-parametric approach to document and dialog analysis based on word frequency rank-statistics. We also present a simple method to establish semantic saliency in dialog, documents, and dialog segments using these word frequency rank statistics. Applications of our technique include the dynamic tracking of topic and semantic evolution in a dialog, topic detection, automatic generation of document tags, and new story or event detection in conversational speech and text. Our approach benefits from the robustness, simplicity and efficiency of non-parametric and rank based approaches and consistently outperformed term-frequency and TF-IDF cosine distance approaches in several experiments conducted. 1 Ba...
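The idea of comparing word frequency ranks between a dialog segment and a reference text can be illustrated with a minimal sketch. The abstract does not give the formula, so the definition used here (the rank of a word in the reference minus its rank in the segment, divided by its rank in the segment) is an assumption for illustration only, and the names `frequency_ranks`, `relative_rank_differential`, and `salient_words` are hypothetical. The tie-breaking rule and the rank assigned to words missing from the reference are also assumptions.

```python
from collections import Counter


def frequency_ranks(tokens):
    """Map each word to its 1-based frequency rank (1 = most frequent)."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: i + 1 for i, w in enumerate(ordered)}


def relative_rank_differential(segment_tokens, reference_tokens):
    """Score each segment word by how far its rank rose relative to the reference.

    ASSUMED form (not taken from the abstract):
        score(w) = (rank_reference(w) - rank_segment(w)) / rank_segment(w)
    Words absent from the reference get the rank just past its last word.
    """
    seg = frequency_ranks(segment_tokens)
    ref = frequency_ranks(reference_tokens)
    unseen_rank = len(ref) + 1
    return {w: (ref.get(w, unseen_rank) - r) / r for w, r in seg.items()}


def salient_words(segment_tokens, reference_tokens, k=3):
    """Return the k words with the largest assumed rank differential."""
    scores = relative_rank_differential(segment_tokens, reference_tokens)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    reference = "the the the the a a a flight hotel".split()
    segment = "flight flight flight the a".split()
    print(relative_rank_differential(segment, reference))
    print(salient_words(segment, reference, k=1))
```

Under this assumed definition, a word that is much more prominent in the segment than in the reference receives a large positive score, while common function words score near or below zero. Because only ranks are used, the score does not depend on the raw counts or on the segment length, which is the kind of robustness the abstract attributes to rank-based approaches.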