Paper: Tagging And Morphological Disambiguation Of Turkish Text

ACL ID A94-1024
Title Tagging And Morphological Disambiguation Of Turkish Text
Venue Applied Natural Language Processing Conference
Session Main Conference
Year 1994

Automatic text tagging is an important component in higher-level analysis of text corpora, and its output can be used in many natural language processing applications. In languages with agglutinative morphology, such as Turkish or Finnish, morphological disambiguation is a crucial process in tagging, as the structures of many lexical forms are morphologically ambiguous. This paper describes a POS tagger for Turkish text based on a full-scale two-level specification of Turkish morphology, which is in turn based on a lexicon of about 24,000 root words. This is augmented with a multi-word and idiomatic construct recognizer and, most importantly, a morphological disambiguator based on local neighborhood constraints, heuristics, and a limited amount of statistical information. The tagger also has ...
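
To make the idea of disambiguation by local neighborhood constraints concrete, the following is a minimal Python sketch and not the paper's implementation. The `Parse` type, the `CANDIDATES` table standing in for morphological analyzer output, the feature labels (`P2SG`, `P3SG`, `GEN`), and the single rule are all assumptions chosen for illustration. The example uses the Turkish form "evin", which can be read as "your house" (second person singular possessive) or "of the house" (genitive); before a possessed noun such as "kapısı" ("its door"), the genitive reading is the plausible one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parse:
    """One candidate morphological analysis (hypothetical representation)."""
    root: str
    pos: str
    features: frozenset


def parse(root, pos, *features):
    return Parse(root, pos, frozenset(features))


# Hypothetical analyzer output: each surface form maps to its candidate parses.
CANDIDATES = {
    "evin": [parse("ev", "Noun", "P2SG"),   # "your house"
             parse("ev", "Noun", "GEN")],   # "of the house"
    "kapısı": [parse("kapı", "Noun", "P3SG")],  # "its door"
}


def prefer_genitive_before_possessed(parses, right_parses):
    """Illustrative local constraint: if the right neighbor is unambiguously a
    third-person-possessed noun, keep only the genitive readings of this token.
    Falls back to all parses if no genitive reading exists."""
    right_is_possessed = bool(right_parses) and all(
        p.pos == "Noun" and "P3SG" in p.features for p in right_parses
    )
    if right_is_possessed:
        kept = [p for p in parses if "GEN" in p.features]
        return kept or parses
    return parses


def disambiguate(tokens, candidates=CANDIDATES):
    """Apply the constraint to each token using its immediate right neighbor."""
    result = []
    for i, tok in enumerate(tokens):
        parses = candidates.get(tok, [])
        right = candidates.get(tokens[i + 1], []) if i + 1 < len(tokens) else []
        result.append((tok, prefer_genitive_before_possessed(parses, right)))
    return result
```

Calling `disambiguate(["evin", "kapısı"])` leaves only the genitive parse for "evin", while `disambiguate(["evin"])` keeps both readings because no neighbor licenses the rule. A fuller system along the lines the abstract describes would combine many such constraints with heuristics and statistical preferences.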