Paper: Improving Mention Detection Robustness to Noisy Input

ACL ID D10-1033
Title Improving Mention Detection Robustness to Noisy Input
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2010
Authors

Information-extraction (IE) research typically focuses on clean-text inputs. However, an IE engine serving real applications yields many false alarms due to less-well-formed input. For example, IE in a multilingual broadcast processing system has to deal with inaccurate automatic transcription and translation. The resulting presence of non-target-language text in this case, and non-language material interspersed in data from other applications, raise the research problem of making IE robust to such noisy input text. We address one such IE task: entity-mention detection. We describe augmenting a statistical mention-detection system in order to reduce false alarms from spurious passages. The diverse nature of input noise leads us to pursue a multi-faceted approach to robustness. ...