Paper: Lower And Higher Estimates Of The Number Of true Analogies Between Sentences Contained In A Large Multilingual Corpus

ACL ID C04-1106
Title Lower And Higher Estimates Of The Number Of true Analogies Between Sentences Contained In A Large Multilingual Corpus
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2004
Authors
  • Yves Lepage (ATR Spoken Language Translation Research Laboratories, Kyoto, Japan)

The reality of analogies between words is refuted by no one (e.g., I walked is to to walk as I laughed is to to laugh, denoted I walked : to walk :: I laughed : to laugh). But computational linguists seem to be quite dubious about analogies between sentences: they would not be numerous enough to be of any use. We report experiments conducted on a multilingual corpus to estimate the number of analogies among the sentences that it contains. We give two estimates, a lower one and a higher one. As an analogy must be valid on the level of form as well as on the level of meaning, we relied on the idea that translation should preserve meaning to test for similar meanings.
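
To make the notation A : B :: C : D concrete, here is a minimal Python sketch of a character-count test for formal analogies. It checks only a necessary condition (each character must occur as often in A and D together as in B and C together), so it can rule candidates out but cannot confirm them. It is not the procedure used in the paper. The function names `may_be_formal_analogy` and `holds_in_all_languages` are hypothetical, and the second function is only one assumed reading of the idea that translations should preserve an analogy.

```python
from collections import Counter


def may_be_formal_analogy(a: str, b: str, c: str, d: str) -> bool:
    """Necessary (not sufficient) condition for a : b :: c : d on the level of form.

    Every character must occur as many times in a and d taken together
    as in b and c taken together.
    """
    return Counter(a) + Counter(d) == Counter(b) + Counter(c)


def holds_in_all_languages(quadruples: list[tuple[str, str, str, str]]) -> bool:
    """Assumed meaning check: the same four sentences, given once per language
    as aligned translations, must pass the formal test in every language."""
    return all(may_be_formal_analogy(*quad) for quad in quadruples)


# The example from the abstract passes the necessary condition.
english = ("I walked", "to walk", "I laughed", "to laugh")
print(may_be_formal_analogy(*english))

# Replacing the last term with an unrelated form fails it.
print(may_be_formal_analogy("I walked", "to walk", "I laughed", "to smile"))
```

A quadruple that passes this filter would still need a stricter test (for instance on character order) before being counted as an analogy of form.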