Paper: High Throughput Modularized NLP System For Clinical Text

ACL ID P05-3007
Title High Throughput Modularized NLP System For Clinical Text
Venue Annual Meeting of the Association for Computational Linguistics
Session System Demonstration
Year 2005

This paper presents the results of the development of a high-throughput, real-time, modularized text analysis and information retrieval system that identifies clinically relevant entities in clinical notes, maps the entities to several standardized nomenclatures, and makes them available for subsequent information retrieval and data mining. The performance of the system was validated on a small collection of 351 documents partitioned into 4 query topics and manually examined by 3 physicians and 3 nurse abstractors for relevance to the query topics. We find that simple key phrase searching results in 73% recall and 77% precision. A combination of NLP approaches to indexing improves the recall to 92%, while lowering the precision to 67%.
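The trade-off between the two configurations can be summarized with a balanced F-measure. The sketch below is illustrative only: the abstract reports precision and recall but no F1, so the F1 values are derived here from the reported percentages, and the names `f1_score`, `reported`, and `f1_by_method` are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
reported = {
    "key phrase search": {"precision": 0.77, "recall": 0.73},
    "combined NLP indexing": {"precision": 0.67, "recall": 0.92},
}

# Derived values (assumption: not stated in the paper itself).
f1_by_method = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

for name, f1 in f1_by_method.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Under this derived measure the two configurations come out at roughly 0.75 and 0.78, which suggests that the recall gain from NLP indexing slightly outweighs the precision loss when both are weighted equally.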