Paper: Serial Combination Of Rules And Statistics: A Case Study In Czech Tagging

ACL ID P01-1035
Title Serial Combination Of Rules And Statistics: A Case Study In Czech Tagging
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2001
Authors

A hybrid system is described which combines the strength of manual rule-writing and statistical learning, obtaining results superior to both methods if applied separately. The combination of a rule-based system and a statistical one is not parallel but serial: the rule-based system performing partial disambiguation with recall close to 100% is applied first, and a trigram HMM tagger runs on its results. An experiment in Czech tagging has been performed with encouraging results.

1 Tagging of Inflective Languages

Inflective languages pose a specific problem in tagging due to two phenomena: highly inflective nature (causing sparse data problem in any statistically-based system), and free word order (causing fixed-context systems, such as n-gram Hidden Markov Models (HMMs), to be ev...
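
The serial architecture described in the abstract (rules first, then a trigram HMM over whatever the rules leave) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the toy English words, the coarse tags `DET`/`NOUN`/`VERB` (Czech tags are far richer), the single hand-written rule in `rule_filter`, the probability tables, and the function names. The only properties it tries to mirror are that the rule stage removes readings without ever emptying a candidate set, and that the statistical stage chooses only among the surviving readings.

```python
import math

START = "<s>"  # hypothetical sentence-start pseudo-tag

def rule_filter(candidates):
    """Rule stage (partial disambiguation): prune readings, never empty a set.

    Single illustrative rule, invented for this sketch: directly after a
    token whose only reading is DET, discard the VERB reading.
    """
    out = [set(c) for c in candidates]
    for i in range(1, len(out)):
        if out[i - 1] == {"DET"} and len(out[i]) > 1:
            reduced = out[i] - {"VERB"}
            if reduced:  # keep at least one reading per token
                out[i] = reduced
    return out

def viterbi_trigram(words, candidates, trans, emit, floor=1e-6):
    """Statistical stage: trigram HMM Viterbi restricted to `candidates`.

    A search state is the pair of the last two tags. Unseen events get a
    small floor probability instead of proper smoothing.
    """
    best = {(START, START): (0.0, [])}
    for word, cands in zip(words, candidates):
        new_best = {}
        for (t1, t2), (score, path) in best.items():
            for t3 in sorted(cands):  # sorted for deterministic iteration
                s = (score
                     + math.log(trans.get((t1, t2, t3), floor))
                     + math.log(emit.get((word, t3), floor)))
                key = (t2, t3)
                if key not in new_best or s > new_best[key][0]:
                    new_best[key] = (s, path + [t3])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

def serial_tag(words, candidates, trans, emit):
    """Serial combination: the HMM runs on the output of the rule stage."""
    return viterbi_trigram(words, rule_filter(candidates), trans, emit)

# Toy input: each token carries its set of possible tags, as a
# morphological analyzer would supply them (values are made up).
words = ["the", "can", "rusts"]
candidates = [{"DET"}, {"NOUN", "VERB"}, {"NOUN", "VERB"}]
trans = {
    (START, START, "DET"): 0.6,
    (START, "DET", "NOUN"): 0.8,
    (START, "DET", "VERB"): 0.1,
    ("DET", "NOUN", "VERB"): 0.7,
    ("DET", "NOUN", "NOUN"): 0.2,
}
emit = {
    ("the", "DET"): 0.9,
    ("can", "NOUN"): 0.3,
    ("can", "VERB"): 0.6,
    ("rusts", "VERB"): 0.5,
    ("rusts", "NOUN"): 0.1,
}

filtered = rule_filter(candidates)
tags = serial_tag(words, candidates, trans, emit)
print(filtered)
print(tags)
```

In this toy run the rule stage resolves the second token on its own, so the HMM only has to decide the third. The design point is the division of labor: the rules shrink the search space while aiming to keep the correct reading, and the trigram model resolves the ambiguity that remains.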