Paper: Named Entity Recognition Using A Character-Based Probabilistic Approach

ACL ID W03-0432
Title Named Entity Recognition Using A Character-Based Probabilistic Approach
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2003
Authors

We present a named entity recognition and classification system that uses only probabilistic character-level features. Classifications by multiple orthographic tries are combined in a hidden Markov model framework to incorporate both internal and contextual evidence. As part of the system, we perform a preprocessing stage in which capitalisation is restored to sentence-initial and all-caps words with high accuracy. We report f-values of 86.65 and 79.78 for English, and 50.62 and 54.43 for the German datasets.
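
To illustrate the idea of classifying a word from character-level evidence alone, the following is a minimal sketch of one orthographic trie with per-class counts. It is not the authors' implementation: the names `TrieNode`, `CharTrie`, `add`, and `class_probs`, the label set, the add-one smoothing, and the back-off to the deepest matching prefix are all assumptions made for illustration, and the hidden Markov model combination described in the abstract is not shown.

```python
from collections import Counter


class TrieNode:
    """One node per character prefix (hypothetical structure)."""

    def __init__(self):
        self.children = {}       # next character -> TrieNode
        self.counts = Counter()  # class label -> training words sharing this prefix


class CharTrie:
    """Character trie that stores class counts at every prefix."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.root = TrieNode()

    def add(self, word, label):
        # Record the label at the root and at every prefix of the word.
        node = self.root
        node.counts[label] += 1
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.counts[label] += 1

    def class_probs(self, word):
        # Walk as deep as the trie allows, then stop at the longest seen prefix.
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                break
            node = nxt
        # Add-one smoothing (an assumption) so unseen classes keep some mass.
        total = sum(node.counts.values()) + len(self.labels)
        return {lab: (node.counts[lab] + 1) / total for lab in self.labels}


# Toy training data, invented for this example only.
trie = CharTrie(["PER", "LOC", "O"])
trie.add("London", "LOC")
trie.add("Londonderry", "LOC")
trie.add("Lonnie", "PER")
trie.add("the", "O")

# "Londres" is unseen; its longest known prefix is "Lond".
probs = trie.class_probs("Londres")
print(max(probs, key=probs.get))
```

In this sketch an unseen word inherits the class distribution of its longest known prefix, which is one simple way to obtain internal (word-shape) evidence. A suffix trie could be built the same way by inserting reversed words.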