Paper: Annotating Multiple Types Of Biomedical Entities: A Single Word Classification Approach

ACL ID W04-1215
Title Annotating Multiple Types Of Biomedical Entities: A Single Word Classification Approach
Venue International Joint Workshop On Natural Language Processing In Biomedicine And Its Applications NLPBA BioNLP
Session
Year 2004
Authors

Named entity recognition is a fundamental task in biomedical data mining. Multiple-class annotation is more challenging than single-class annotation. In this paper, we take a single-word classification approach to the multiple-class annotation problem using Support Vector Machines (SVMs). Word attributes, results of existing gene/protein name taggers, context, and other information are important features for classification. During training, the size of the training data and the distribution of named entities are considered. The preliminary results showed that the approach might be feasible when more training data is used to alleviate the data imbalance problem.
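
As a rough illustration of what single-word classification features might look like, the sketch below builds one feature dictionary per token from its surface attributes, the label assigned by an external tagger, and its neighbouring words. The specific attributes, the function names (`word_attributes`, `token_features`), the window size, and the `<PAD>` marker are assumptions made for this example; the abstract does not list the actual feature set. In a full pipeline, dictionaries like these would be vectorized and passed to an SVM implementation of one's choice.

```python
def word_attributes(word):
    """Surface attributes of one token (illustrative choices, not the paper's list)."""
    return {
        "lower": word.lower(),
        "has_digit": any(ch.isdigit() for ch in word),
        "has_upper": any(ch.isupper() for ch in word),
        "has_hyphen": "-" in word,
        "suffix3": word[-3:].lower(),
    }


def token_features(tokens, i, tagger_labels=None, window=1):
    """Feature dict for token i: own attributes, external tagger label, context words."""
    feats = {f"w0:{k}": v for k, v in word_attributes(tokens[i]).items()}
    if tagger_labels is not None:
        # Output of an existing gene/protein tagger, used as one more feature.
        feats["tagger"] = tagger_labels[i]
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        # Positions outside the sentence get a padding marker (an assumption).
        in_range = 0 <= j < len(tokens)
        feats[f"w{offset:+d}:lower"] = tokens[j].lower() if in_range else "<PAD>"
    return feats


# Hypothetical sentence fragment and tagger output, for illustration only.
tokens = ["IL-2", "gene", "expression"]
tagger = ["PROTEIN", "O", "O"]
rows = [token_features(tokens, i, tagger) for i in range(len(tokens))]
```

Each entry of `rows` is classified independently, which is what makes this a per-word rather than a sequence-level formulation; the context features are the only way neighbouring words influence a token's predicted class.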