Paper: Statistical Morphological Disambiguation For Agglutinative Languages

ACL ID C00-1042
Title Statistical Morphological Disambiguation For Agglutinative Languages
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2000

In this paper, we present statistical models for morphological disambiguation in Turkish. Turkish presents an interesting problem for statistical models since the potential tag set size is very large because of the productive derivational morphology. We propose to handle this by breaking up the morphosyntactic tags into inflectional groups, each of which contains the inflectional features for each (intermediate) derived form. Our statistical models score the probability of each morphosyntactic tag by considering statistics over the individual inflectional groups in a trigram model. Among the three models that we have developed and tested, the simplest model, ignoring the local morphotactics within words, performs the best. Our best trigram model performs with 93...
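
The idea of scoring a tag through its inflectional groups can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that a derivation boundary inside a tag is written `^DB`, that the inflectional groups of a word are treated as independent of one another (one reading of "ignoring the local morphotactics within words"), that each group is conditioned on the last inflectional group of the two preceding words, and that add-alpha smoothing is used. The names `split_igs`, `IGTrigramModel`, `BOS`, and the toy tags are hypothetical.

```python
import math
from collections import Counter

DB = "^DB"   # assumed marker for a derivation boundary inside a tag
BOS = "<s>"  # hypothetical sentence-start padding symbol


def split_igs(tag):
    """Split one morphosyntactic tag into its inflectional groups (IGs)."""
    return tag.split(DB)


class IGTrigramModel:
    """Trigram model over IGs with add-alpha smoothing (an assumption)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.tri = Counter()  # (last IG of w-2, last IG of w-1, IG) counts
        self.ctx = Counter()  # (last IG of w-2, last IG of w-1) counts
        self.igs = set()      # all IGs seen in training

    def train(self, tagged_sentences):
        for tags in tagged_sentences:
            prev2, prev1 = BOS, BOS
            for tag in tags:
                groups = split_igs(tag)
                for ig in groups:
                    self.tri[(prev2, prev1, ig)] += 1
                    self.ctx[(prev2, prev1)] += 1
                    self.igs.add(ig)
                # Only the final IG of a word is carried forward as context.
                prev2, prev1 = prev1, groups[-1]

    def log_prob(self, tag, prev2_tag, prev1_tag):
        """Sum of smoothed log-probabilities of the IGs that make up `tag`."""
        c2 = split_igs(prev2_tag)[-1]
        c1 = split_igs(prev1_tag)[-1]
        vocab = len(self.igs) + 1  # one extra slot for unseen IGs
        total = 0.0
        for ig in split_igs(tag):
            num = self.tri[(c2, c1, ig)] + self.alpha
            den = self.ctx[(c2, c1)] + self.alpha * vocab
            total += math.log(num / den)
        return total

    def best(self, candidates, prev2_tag, prev1_tag):
        """Pick the candidate tag with the highest score in this context."""
        return max(candidates,
                   key=lambda t: self.log_prob(t, prev2_tag, prev1_tag))


# Toy, made-up training data; the feature names are illustrative only.
toy_corpus = [
    ["Adj", "Noun+A3sg", "Verb+Past"],
    ["Adj", "Noun+A3sg", "Verb+Past"],
    ["Noun+A3sg^DB+Adj", "Noun+A3sg", "Verb+Past"],
]
model = IGTrigramModel()
model.train(toy_corpus)
print(model.best(["Verb+Past", "Adj"], "Adj", "Noun+A3sg"))
```

Because counts are kept per inflectional group and not per full tag, a derived form never seen as a whole can still receive a usable score from its parts, which is the motivation for the decomposition given above. A full disambiguator would search over whole sentences (for example with Viterbi decoding) instead of choosing each word greedily as `best` does.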