Paper: A Computer Readability Formula Of Japanese Texts For Machine Scoring

ACL ID C88-2135
Title A Computer Readability Formula Of Japanese Texts For Machine Scoring
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1988
Authors

A readability formula is obtained that can be used by computer programs for style checking of Japanese texts and that needs no syntactic or semantic information. The formula is derived as a linear combination of the surface characteristics of the text that are related to its readability: (1) the average number of characters per sentence, (2) for each type of character (Roman alphabet, kanzi, hiragana, katakana), the relative frequency of runs (maximal strings) that consist only of that type of character, (3) the average number of characters per run for each type, and (4) the tooten (comma) to kuten (period) ratio. To find the proper weighting, principal component analysis (PCA) was applied to these characteristics taken from 77 sample texts. We have found a component which is related to the r...