Paper: Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints

ACL ID D14-1147
Title Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

We propose a new Chinese abbreviation prediction method that incorporates rich local information while generating the abbreviation globally. Unlike previous character tagging methods, we introduce the minimum semantic unit, which is finer-grained than a word but coarser-grained than a character, to capture word-level information in the sequence labeling framework. To solve the "character duplication" problem in Chinese abbreviation prediction, we also use a substring tagging strategy to generate local substring tagging candidates. We use an integer linear programming (ILP) formulation with various constraints to globally decode the final abbreviation from the generated candidates. Experiments show that our method outperforms the state-of-the-art systems, without u...
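
The idea of decoding globally under constraints can be illustrated with a minimal sketch. It is not the paper's formulation. It assumes each character has a local "keep" probability from some tagger, and it replaces the ILP solver with brute-force enumeration, which is only feasible for short inputs. The function name `decode`, the length bounds, the probabilities, and the example full form 北京大学 (commonly abbreviated 北大) are all illustrative assumptions and do not come from the paper.

```python
from itertools import product
from math import log


def decode(chars, keep_prob, min_len=2, max_len=3):
    """Return the best abbreviation under a global length constraint.

    chars:     the full form, one symbol per position.
    keep_prob: hypothetical local probabilities (strictly between 0 and 1)
               that each position is kept.
    The length bounds are an assumed stand-in for the paper's constraints.
    """
    best_score, best_abbr = float("-inf"), None
    # Enumerate every keep/skip assignment (a brute-force stand-in for ILP).
    for labels in product((0, 1), repeat=len(chars)):
        # Global constraint: number of kept characters must be within bounds.
        if not min_len <= sum(labels) <= max_len:
            continue
        # Objective: sum of local log-probabilities of the chosen labels.
        score = sum(
            log(p if keep else 1.0 - p) for keep, p in zip(labels, keep_prob)
        )
        if score > best_score:
            best_score = score
            best_abbr = "".join(c for c, keep in zip(chars, labels) if keep)
    return best_abbr


# Illustrative scores only; they are not taken from any experiment.
print(decode("北京大学", [0.9, 0.2, 0.8, 0.3]))
```

Tightening the constraint changes the result even though the local scores stay the same, which is the point of decoding globally: with `min_len=3` the same scores yield a three-character output instead.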