ACL ID P96-1040
Title The Rhythm Of Lexical Stress In Prose
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1996

"Prose rhythm" is a widely observed but scarcely quantified phenomenon. We de- scribe an information-theoretic model for measuring the regularity of lexical stress in English texts, and use it in combination with trigram language models to demon- strate a relationship between the probabil- ity of word sequences in English and the amount of rhythm present in them. We find that the stream of lexical stress in text from the Wall Street Journal has an en- tropy rate of less than 0.75 bits per sylla- ble for common sentences. We observe that the average number of syllables per word is greater for rarer word sequences, and to normalize for this effect we run control ex- periments to show that the choice of word order contributes significantly to stress reg- ularity, and increasingly with lexical...