Paper: Graph-Based Clustering for Semantic Classification of Onomatopoetic Words

ACL ID W08-2005
Title Graph-Based Clustering for Semantic Classification of Onomatopoetic Words
Venue Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications
Session
Year 2008
Authors

This paper presents a method for semantic classification of onomatopoetic words like “ひゅーひゅー (hum)” and “からんころん (clip clop)”. Such words exist in every language, and Japanese is especially rich in them. We used a graph-based clustering algorithm called Newman clustering. The algorithm calculates a simple quality function to test whether a particular division is meaningful. The quality function is calculated based on the weights of edges between nodes. We combined two different similarity measures, distributional similarity and orthographic similarity, to calculate the weights. The results obtained by using the Web data showed a 9.0% improvem...
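The quality function described in the abstract can be illustrated with a minimal sketch. It assumes the function is Newman's modularity Q in its weighted form, Q = sum over clusters c of (e_cc - a_c^2), where e_cc is the fraction of total edge weight inside c and a_c is the fraction of weighted edge ends attached to c. The abstract does not say how the two similarity measures are combined, so the linear mix in `combine` and its `alpha` parameter are hypothetical, as are the function names and the toy graph.

```python
from collections import defaultdict


def combine(dist_sim, orth_sim, alpha=0.5):
    """Hypothetical linear mix of two similarity scores.

    The weighting scheme and alpha are assumptions, not the paper's formula.
    """
    return alpha * dist_sim + (1 - alpha) * orth_sim


def modularity(edges, community):
    """Weighted modularity Q for a given division of the graph.

    edges: dict mapping an undirected pair (u, v), listed once, to its weight.
    community: dict mapping each node to its cluster label.
    """
    total = sum(edges.values())
    inside = defaultdict(float)  # weight of edges with both ends in a cluster
    degree = defaultdict(float)  # summed weighted degree of a cluster's nodes
    for (u, v), w in edges.items():
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    return sum(
        inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree
    )


# Toy graph (illustrative only): two triangles joined by a single edge.
edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

print(modularity(edges, split))   # positive: the division is meaningful
print(modularity(edges, merged))  # zero: a single cluster carries no structure
```

In Newman's agglomerative procedure, clusters are merged greedily and the division with the highest Q is kept. In the sketch, the split into two triangles scores higher than the single merged cluster, which is the sense in which Q tests whether a division is meaningful.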