Paper: User-Driven Development of Text Mining Resources for Cancer Risk Assessment

ACL ID W09-1314
Title User-Driven Development of Text Mining Resources for Cancer Risk Assessment
Venue Workshop on Biomedical Natural Language Processing
Session
Year 2009
Authors

One of the most neglected areas of biomedical Text Mining (TM) is the development of systems based on carefully assessed user needs. We investigate the needs of an important task yet to be tackled by TM, Cancer Risk Assessment (CRA), and take the first step towards the development of TM for the task: identifying and organizing the scientific evidence required for CRA in a taxonomy. The taxonomy is based on expert annotation of 1297 MEDLINE abstracts. We report promising results with inter-annotator agreement tests and automatic classification experiments, and a user test which demonstrates that the resources we have built are well-defined, accurate, and applicable to a real-world CRA scenario. We discuss extending and refining the taxonomy further via manual and machi...