Paper: Automatic Chinese Abbreviation Generation Using Conditional Random Field

ACL ID N09-2069
Title Automatic Chinese Abbreviation Generation Using Conditional Random Field
Venue Human Language Technologies
Session Short Paper
Year 2009
Authors

This paper presents a new method for automatically generating abbreviations for Chinese organization names. Abbreviations are commonly used in spoken Chinese, especially for organization names. Generating Chinese abbreviations is much more complex than generating English abbreviations, most of which are acronyms and truncations. The abbreviation generation process is formulated as a character tagging problem, with a conditional random field (CRF) as the tagging model. A carefully selected group of features is used in the CRF model. After the CRF generates a list of abbreviation candidates, a length model is incorporated to re-rank the candidates. Finally, the full-name and abbreviation co-occurrence information from a web search engine is utilized to further improve the pe...
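
The character tagging formulation and the length-based re-ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two-label scheme ("K" for a kept character, "S" for a skipped one), the greedy alignment, the function names, the log-linear score combination, and the example scores and length probabilities are all assumptions made for illustration. The sketch also assumes the abbreviation is a subsequence of the full name, which need not hold for every abbreviation. The CRF itself is not trained here; candidate scores stand in for its output.

```python
import math
from typing import Dict, List, Optional, Tuple


def tag_characters(full_name: str, abbreviation: str) -> Optional[List[str]]:
    """Label each character of full_name as 'K' (kept) or 'S' (skipped).

    Uses a greedy left-to-right match (an assumption of this sketch).
    Returns None when the abbreviation is not a subsequence of the name.
    """
    labels: List[str] = []
    j = 0  # index of the next abbreviation character to match
    for ch in full_name:
        if j < len(abbreviation) and ch == abbreviation[j]:
            labels.append("K")
            j += 1
        else:
            labels.append("S")
    return labels if j == len(abbreviation) else None


def apply_tags(full_name: str, labels: List[str]) -> str:
    """Rebuild an abbreviation by keeping the characters labelled 'K'."""
    return "".join(ch for ch, lab in zip(full_name, labels) if lab == "K")


def rerank_by_length(
    candidates: List[Tuple[str, float]],
    length_prob: Dict[int, float],
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-rank (abbreviation, tagger_log_score) pairs with a length prior.

    length_prob maps abbreviation length to a probability; the values are
    hypothetical. Unseen lengths get a small floor probability.
    """
    floor = 1e-9
    rescored = [
        (abbr, score + weight * math.log(length_prob.get(len(abbr), floor)))
        for abbr, score in candidates
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    name, abbr = "北京大学", "北大"
    labels = tag_characters(name, abbr)
    print(labels)
    print(apply_tags(name, labels))

    # Hypothetical tagger scores and length probabilities.
    candidates = [("北京大", -0.9), ("北大", -1.0)]
    print(rerank_by_length(candidates, {2: 0.6, 3: 0.1}))
```

In this sketch the tag sequence for a training pair would serve as the label sequence for a sequence tagger, and the length prior can move a candidate of a more typical length ahead of one the tagger scored slightly higher.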