Paper: Robust Text Processing In Automated Information Retrieval

ACL ID A94-1028
Title Robust Text Processing In Automated Information Retrieval
Venue Applied Natural Language Processing Conference
Session Main Conference
Year 1994
Authors

We report on the results of a series of experiments with a prototype text retrieval system that uses relatively advanced natural language processing techniques to enhance the effectiveness of statistical document retrieval. In this paper we show that large-scale natural language processing (hundreds of millions of words and more) is not only required for better retrieval, but is also feasible, given appropriate resources. In particular, we demonstrate that the use of syntactic compounds in the representation of database documents as well as in user queries, coupled with an appropriate term weighting strategy, can considerably improve the effectiveness of retrospective search. The experiments reported here were conducted on the TIPSTER database in connection with the Text...