Paper: Hybrid Models for Semantic Classification of Chinese Unknown Words

ACL ID N07-1024
Title Hybrid Models for Semantic Classification of Chinese Unknown Words
Venue Human Language Technologies
Session Main Conference
Year 2007
Authors
  • Xiaofei Lu (Pennsylvania State University, State College PA)

This paper addresses the problem of classifying Chinese unknown words into fine-grained semantic categories defined in a Chinese thesaurus. We describe three novel knowledge-based models that capture the relationship between the semantic categories of an unknown word and those of its component characters in three different ways. We then combine two of the knowledge-based models with a corpus-based model which classifies unknown words using contextual information. Experiments show that the knowledge-based models outperform previous methods on the same task, but the use of contextual information does not further improve performance.
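
To make the idea of relating a word's semantic category to those of its component characters concrete, here is a minimal Python sketch. It is not one of the three models described in the paper, whose details the abstract does not give. It assumes a simple voting scheme: each character votes for the categories of the known words it appears in, with votes normalized per character. The thesaurus entries, category labels, and function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy thesaurus: known word -> semantic category label.
# A real thesaurus would be far larger and more fine-grained.
TOY_THESAURUS = {
    "飞机": "vehicle",
    "客机": "vehicle",
    "汽车": "vehicle",
    "火车": "vehicle",
    "医生": "person",
    "学生": "person",
    "医师": "person",
}


def build_char_category_counts(thesaurus):
    """Count how often each character occurs in words of each category."""
    counts = defaultdict(Counter)
    for word, category in thesaurus.items():
        for ch in word:
            counts[ch][category] += 1
    return counts


def classify_unknown_word(word, char_counts):
    """Guess a category by summing per-character category distributions.

    Assumption: every character contributes its relative category
    frequencies; characters never seen in the thesaurus are skipped.
    Returns None when no character of the word is known.
    """
    scores = Counter()
    for ch in word:
        ch_counts = char_counts.get(ch)
        if not ch_counts:
            continue
        total = sum(ch_counts.values())
        for category, n in ch_counts.items():
            scores[category] += n / total
    if not scores:
        return None
    return scores.most_common(1)[0][0]


char_counts = build_char_category_counts(TOY_THESAURUS)
guess = classify_unknown_word("货车", char_counts)  # "货" is unseen, "车" is known
```

In this toy setup, a word none of whose characters occur in the thesaurus yields `None`. That is the kind of case where contextual, corpus-based evidence would be the only remaining signal.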