Paper: Exploring Deterministic Constraints: from a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation

ACL ID P12-1111
Title Exploring Deterministic Constraints: from a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

We show for both English POS tagging and Chinese word segmentation that, with proper representation, a large number of deterministic constraints can be learned from training examples, and that these are useful in constraining probabilistic inference. For tagging, learned constraints are directly used to constrain Viterbi decoding. For segmentation, character-based tagging constraints can be learned with the same templates. However, they are better applied to a word-based model, so an integer linear programming (ILP) formulation is proposed. For both problems, the corresponding constrained solutions have advantages in both efficiency and accuracy.
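As a rough illustration of what constraining Viterbi decoding can look like, the sketch below restricts the tags considered at each position to an allowed set before running the usual dynamic program. It is a minimal sketch under stated assumptions, not the paper's tagger: the function name `constrained_viterbi`, the log scores, the tag inventory, and the way constraints are represented (one allowed-tag set per position) are all hypothetical choices made for the example.

```python
NEG_INF = float("-inf")


def constrained_viterbi(emit, trans, allowed):
    """Best tag path when position i may only take tags in allowed[i].

    emit[i][t]    -- log score of tag t at position i (toy numbers, assumed)
    trans[(p, t)] -- log score of moving from tag p to tag t (assumed)
    allowed[i]    -- non-empty set of tags permitted at position i; a
                     deterministic constraint shrinks this to a single tag
    """
    n = len(emit)
    if n == 0:
        return []
    best = [{} for _ in range(n)]  # best[i][t]: best score of a path ending in t
    back = [{} for _ in range(n)]  # back[i][t]: previous tag on that path
    for t in allowed[0]:
        best[0][t] = emit[0].get(t, NEG_INF)
    for i in range(1, n):
        for t in allowed[i]:
            # Only tags that survived the constraints at i-1 are candidates.
            score, prev = max(
                (best[i - 1][p] + trans.get((p, t), NEG_INF)
                 + emit[i].get(t, NEG_INF), p)
                for p in best[i - 1]
            )
            best[i][t] = score
            back[i][t] = prev
    last = max(best[-1], key=best[-1].get)
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy three-token sentence; every score here is invented for illustration.
emit = [
    {"DT": 0.0},
    {"NN": -1.0, "MD": -0.5},
    {"VBZ": -0.2, "NN": -2.0},
]
trans = {
    ("DT", "NN"): -0.1, ("DT", "MD"): -3.0,
    ("NN", "VBZ"): -0.3, ("MD", "VBZ"): -4.0,
    ("NN", "NN"): -1.0, ("MD", "NN"): -3.0,
}
unconstrained = [{"DT"}, {"NN", "MD"}, {"VBZ", "NN"}]
# A hypothetical learned constraint forces the second token to be MD.
constrained = [{"DT"}, {"MD"}, {"VBZ", "NN"}]

print(constrained_viterbi(emit, trans, unconstrained))
print(constrained_viterbi(emit, trans, constrained))
```

Because the inner loop only visits tags in `allowed[i]`, tighter constraints shrink the search space as well as steering the result, which is one plausible reading of how constrained decoding can gain in efficiency as well as accuracy.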