Paper: Increasing Our Ignorance Of Language: Identifying Language Structure In An Unknown 'Signal'

ACL ID W00-0705
Title Increasing Our Ignorance Of Language: Identifying Language Structure In An Unknown 'Signal'
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2000
Authors

This paper describes algorithms and software developed to characterise and detect generic intelligent language-like features in an input signal, using natural language learning techniques: looking for characteristic statistical "language-signatures" in test corpora. As a first step towards such species-independent language-detection, we present a suite of programs to analyse digital representations of a range of data, and use the results to extrapolate whether or not there are language-like structures which distinguish this data from other sources, such as music, images, and white noise. Outside our own immediate NLP sphere, generic communication techniques are of particular interest in the astronautical community, where two sessions are dedicated to SETI at their annual Internatio...
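
To make the idea of a statistical "language-signature" concrete, the following is a minimal sketch of one very simple candidate statistic: the Shannon entropy of a signal's byte distribution. This is an illustrative assumption, not the authors' method; the abstract does not say which statistics their programs compute. The function name `byte_entropy` and both sample signals are hypothetical.

```python
import math
import random
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy, in bits per byte, of the unigram byte distribution.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A toy "language-like" signal: repeated English text (assumed sample).
text = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")

# A toy "white noise" signal of the same length: uniform pseudo-random bytes.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(len(text)))

text_entropy = byte_entropy(text)
noise_entropy = byte_entropy(noise)

print(f"text : {text_entropy:.2f} bits/byte")
print(f"noise: {noise_entropy:.2f} bits/byte")
```

Under these assumptions the text sample scores well below the noise sample, which sits near the 8 bits/byte ceiling. A single unigram statistic like this would not by itself separate language from music or images; it only illustrates the kind of measurement a signature could be built from.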