Paper: Hypothesizing Word Association From Untagged Text

ACL ID H93-1049
Title Hypothesizing Word Association From Untagged Text
Venue Human Language Technologies
Session Main Conference
Year 1993
Authors

This paper reports a new method for suggesting word associations, based on a greedy algorithm that applies chi-square statistics to the joint frequencies of pairs of word groups, compared against chance co-occurrence. The approach has two benefits: 1) it can consider even low-frequency words and word pairs, and 2) it generates word groups and word associations automatically. The method achieved 87% accuracy in hypothesizing word associations for unobserved combinations of words in Japanese text.
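
To make the comparison "against chance co-occurrence" concrete, the following is a minimal sketch of a Pearson chi-square statistic on a 2x2 contingency table built from joint and marginal counts. It is an illustration only, not the paper's implementation: the abstract does not give the exact formulation over word groups or the greedy grouping procedure, and the function names, parameters, and counts below are hypothetical.

```python
def expected_joint(count_a, count_b, total):
    """Joint count expected by chance if A and B occurred independently."""
    return count_a * count_b / total


def chi_square_2x2(joint, count_a, count_b, total):
    """Pearson chi-square for a 2x2 co-occurrence table.

    joint   -- number of contexts containing both A and B (hypothetical unit)
    count_a -- number of contexts containing A
    count_b -- number of contexts containing B
    total   -- total number of contexts
    """
    # Observed cells: (A, B), (A, not B), (not A, B), (not A, not B)
    observed = [
        [joint, count_a - joint],
        [count_b - joint, total - count_a - count_b + joint],
    ]
    row_totals = [count_a, total - count_a]
    col_totals = [count_b, total - count_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                # Degenerate table (a margin is empty): no evidence either way.
                return 0.0
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Assumed toy counts, not data from the paper.
score = chi_square_2x2(joint=10, count_a=20, count_b=20, total=100)
above_chance = 10 > expected_joint(20, 20, 100)
print(score, above_chance)
```

Because the chi-square statistic is symmetric, a large value alone does not distinguish co-occurrence above chance from co-occurrence below chance; the sketch therefore also compares the observed joint count with its expected value.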