Paper: Towards Gene Recognition from Rare and Ambiguous Abbreviations using a Filtering Approach

ACL ID W14-3418
Title Towards Gene Recognition from Rare and Ambiguous Abbreviations using a Filtering Approach
Venue Proceedings of the BioNLP Shared Task 2013 Workshop
Session
Year 2014
Authors

Retrieving information about highly ambiguous gene/protein homonyms is a challenge, in particular where their non-protein meanings are more frequent than their protein meaning (e.g., SAH or HF). Due to their limited coverage in common benchmarking data sets, the performance of existing gene/protein recognition tools on these problematic cases is hard to assess. We uniformly sample a corpus of eight ambiguous gene/protein abbreviations from MEDLINE® and provide manual annotations for each mention of these abbreviations. Based on this resource, we show that available gene recognition tools such as conditional random fields (CRF) trained on BioCreative 2 NER data or GNAT tend to underperform on this phenomenon. We propose to extend existing gene recognition approaches by ...