Paper: Large Scale Collocation Data and Their Application to Japanese Word Processor Technology

ACL ID C98-1110
Title Large Scale Collocation Data and Their Application to Japanese Word Processor Technology
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors

Word processors and computers used in Japan employ a Japanese input method in which keyboard strokes are combined with Kana (phonetic) character to Kanji (ideographic, Chinese) character conversion technology. The key factor in Kana-to-Kanji conversion technology is how to raise conversion accuracy through homophone processing, since Japanese has so many homophonic Kanji. In this paper, we report the results of our Kana-to-Kanji conversion experiments, which embody homophone processing based on large-scale collocation data. It is shown that approximately 135,000 collocations yield a 9.1% increase in conversion accuracy compared with the prototype system, which has no collocation data,
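
As a rough illustration of how collocation data can drive homophone selection, the following minimal sketch picks a Kanji candidate for a Kana reading by checking whether it forms a known collocation with the surrounding words. The abstract does not describe the format of the collocation data or the selection procedure actually used, so the triple format (noun, particle, verb), the toy dictionaries, and the function name `choose_kanji` are all assumptions made for illustration only.

```python
# Hypothetical toy data, NOT taken from the paper.
# Each Kana reading maps to its homophonic Kanji candidates,
# listed in an assumed default (fallback) order.
HOMOPHONES = {
    "はし": ["端", "橋", "箸"],  # edge, bridge, chopsticks
}

# Assumed collocation format: (noun, particle, verb) triples.
COLLOCATIONS = {
    ("橋", "を", "渡る"),  # cross a bridge
    ("箸", "を", "使う"),  # use chopsticks
}


def choose_kanji(reading, particle, verb):
    """Return the first candidate that forms a known collocation
    with the given particle and verb; otherwise fall back to the
    first candidate (or to the reading itself if it is unknown)."""
    candidates = HOMOPHONES.get(reading, [reading])
    for kanji in candidates:
        if (kanji, particle, verb) in COLLOCATIONS:
            return kanji
    return candidates[0]


if __name__ == "__main__":
    print(choose_kanji("はし", "を", "渡る"))
    print(choose_kanji("はし", "を", "使う"))
```

In this sketch, a reading with no matching collocation simply falls back to the default candidate, which stands in for the behaviour of a system that has no collocation data.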