Paper: Tagging With Hidden Markov Models Using Ambiguous Tags

ACL ID C04-1082
Title Tagging With Hidden Markov Models Using Ambiguous Tags
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004

Part-of-speech taggers based on Hidden Markov Models rely on a series of hypotheses that make certain errors inevitable. The idea developed in this paper is to allow a limited, controlled ambiguity in the tagger's output in order to avoid some of these errors. The ambiguity takes the form of ambiguous tags, which denote subsets of the tagset. These tags are used when the tagger hesitates between the different components of an ambiguous tag. They are introduced into an existing lexicon and 3-gram database. Their lexical and syntactic counts are computed from the lexical and syntactic counts of their constituents, using impurity functions. The tagging process itself, based on the Viterbi algorithm, is unchanged. Experiments conducted on the Brown corpus show a re...
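
The mechanism described in the abstract can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the paper's implementation. The abstract does not say which impurity function is used or how counts are derived from it, so Gini impurity appears here only as one common example of an impurity function. The decoder is a bigram HMM rather than the 3-gram model the paper mentions, and the tag names, the ambiguous tag "NN|VB", and all probabilities are invented toy values. The point it shows is that an ambiguous tag is simply one more state, so the Viterbi decoder itself needs no change.

```python
import math


def gini_impurity(counts):
    """Gini impurity of a tag-count distribution (0.0 when one tag has all the mass).

    Assumption: Gini is used only as an example of an impurity function.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def _log(p):
    # Log-probability, with zero probability mapped to negative infinity.
    return math.log(p) if p > 0 else float("-inf")


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Most probable tag sequence under a bigram HMM (simplification of a 3-gram model)."""
    if not words:
        return []
    # best[i][t] = (log-probability of the best path ending in tag t at word i, previous tag)
    best = [{
        t: (_log(start_p.get(t, 0)) + _log(emit_p.get(t, {}).get(words[0], 0)), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            emit = _log(emit_p.get(t, {}).get(words[i], 0))
            # Pick the predecessor tag that maximizes the path score into t.
            prev = max(
                tags,
                key=lambda p: best[i - 1][p][0] + _log(trans_p.get(p, {}).get(t, 0)),
            )
            score = best[i - 1][prev][0] + _log(trans_p.get(prev, {}).get(t, 0)) + emit
            column[t] = (score, prev)
        best.append(column)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


# Toy model (hypothetical values): "NN|VB" is an ambiguous tag standing for {NN, VB}.
TAGS = ["DT", "NN", "VB", "NN|VB"]
START = {"DT": 0.8, "NN": 0.1, "VB": 0.05, "NN|VB": 0.05}
TRANS = {"DT": {"NN": 0.5, "NN|VB": 0.4, "VB": 0.1}}
EMIT = {"DT": {"the": 1.0}, "NN": {"dog": 1.0}, "VB": {"eat": 1.0}, "NN|VB": {"run": 1.0}}

print(gini_impurity({"NN": 5, "VB": 5}))  # 0.5
print(viterbi(["the", "run"], TAGS, START, TRANS, EMIT))  # ['DT', 'NN|VB']
```

In this toy lexicon the word "run" is listed only under the ambiguous tag, so the decoder outputs "NN|VB" for it instead of committing to NN or VB. An evenly split count distribution such as 5 NN versus 5 VB has the maximum two-class Gini impurity of 0.5, which is the kind of signal that could motivate an ambiguous tag.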