Paper: Towards Robust Context-Sensitive Sentence Alignment For Monolingual Corpora

ACL ID E06-1021
Title Towards Robust Context-Sensitive Sentence Alignment For Monolingual Corpora
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2006
Authors

Aligning sentences belonging to comparable monolingual corpora has been suggested as a first step towards training text rewriting algorithms, for tasks such as summarization or paraphrasing. We present here a new monolingual sentence alignment algorithm, combining a sentence-based TF*IDF score, turned into a probability distribution using logistic regression, with a global alignment dynamic programming algorithm. Our approach provides a simpler and more robust solution, achieving a substantial improvement in accuracy over existing systems.
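
The three components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The smoothed IDF formula, the logistic coefficients `a` and `b` (fixed by hand here rather than fitted by regression on annotated data), the 0.5 match threshold, the zero-cost `skip` move, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one TF*IDF vector per sentence, treating each sentence as a document."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumption): never zero, even for very common words.
        vectors.append({w: c * (math.log((1 + n) / (1 + df[w])) + 1.0)
                        for w, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def match_probability(sim, a=-4.0, b=10.0):
    """Logistic function mapping a similarity score to a match probability.
    The coefficients a and b are hypothetical, not fitted values."""
    return 1.0 / (1.0 + math.exp(-(a + b * sim)))


def align(src, tgt, a=-4.0, b=10.0, skip=0.0):
    """Monotone global alignment by dynamic programming.
    A diagonal move pairs two sentences and gains (p - 0.5);
    a horizontal or vertical move leaves one sentence unaligned."""
    vecs = tfidf_vectors(src + tgt)
    sv, tv = vecs[:len(src)], vecs[len(src):]
    gain = [[match_probability(cosine(s, t), a, b) - 0.5 for t in tv] for s in sv]
    n, m = len(src), len(tgt)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] + skip
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] + skip
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + gain[i - 1][j - 1],
                             best[i - 1][j] + skip,
                             best[i][j - 1] + skip)
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + gain[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + skip:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

Calling `align(src, tgt)` on two lists of sentence strings returns index pairs `(i, j)` in document order. Because the dynamic program optimizes the whole alignment at once, a pair is chosen in the context of its neighbours rather than by its local score alone.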