Paper: Examining The Content Load Of Part Of Speech Blocks For Information Retrieval

ACL ID P06-2069
Title Examining The Content Load Of Part Of Speech Blocks For Information Retrieval
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006
Authors

We investigate the connection between part of speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there exists a directly proportional relation between the frequency of POS blocks and their content salience. We also hypothesise that the class membership of the parts of speech within such blocks reflects the content load of the blocks, on the basis that open class parts of speech are more content-bearing than closed class parts of speech. We test these hypotheses in the context of Information Retrieval, by syntactically representing queries, and removing from them content-poor blocks, in line with the aforementioned hypotheses. For our first hypothesis, we induce POS distribution information from a corpus, ...
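
As an illustration of the two hypotheses, the following is a minimal sketch, not the authors' implementation. It assumes that a POS block is a contiguous sequence of n POS tags, that tags follow Penn Treebank conventions, and that a block counts as content-poor when its corpus frequency falls below a threshold. The function names, the block length `n`, the open-class tag prefixes and the `min_count` threshold are all hypothetical choices made for this example.

```python
from collections import Counter

# Assumption: Penn Treebank-style tag prefixes treated as open class
# (nouns, verbs, adjectives, adverbs). The abstract does not name a tagset.
OPEN_CLASS_PREFIXES = ("NN", "VB", "JJ", "RB")


def pos_blocks(tags, n):
    """Return every contiguous block of n POS tags (assumed block definition)."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def block_frequencies(tagged_sentences, n):
    """Count POS blocks over a corpus of (word, tag) sentences."""
    counts = Counter()
    for sentence in tagged_sentences:
        counts.update(pos_blocks([tag for _, tag in sentence], n))
    return counts


def open_class_ratio(block):
    """Fraction of tags in a block that are open class (second hypothesis)."""
    return sum(tag.startswith(OPEN_CLASS_PREFIXES) for tag in block) / len(block)


def filter_query(tagged_query, freqs, n, min_count):
    """Keep query words covered by at least one block seen min_count+ times.

    Hypothetical reading of the first hypothesis: infrequent blocks are
    treated as content-poor and the words they alone cover are dropped.
    """
    tags = [tag for _, tag in tagged_query]
    keep = set()
    for i, block in enumerate(pos_blocks(tags, n)):
        if freqs[block] >= min_count:
            keep.update(range(i, i + n))
    return [word for j, (word, _) in enumerate(tagged_query) if j in keep]


# Toy, hand-tagged data used only to exercise the functions.
corpus = [
    [("the", "DT"), ("cat", "NN"), ("sat", "VBD")],
    [("a", "DT"), ("dog", "NN"), ("barked", "VBD")],
]
freqs = block_frequencies(corpus, n=2)
query = [("find", "VB"), ("the", "DT"), ("cat", "NN")]
filtered = filter_query(query, freqs, n=2, min_count=1)
```

On this toy data, the block `("VB", "DT")` never occurs in the corpus, so "find" is dropped while "the" and "cat" are kept. A real experiment would need an automatic tagger, a large corpus and a tuned threshold, none of which are specified here.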