ACL Anthology Network (beta)
ACL ID | A97-1046 |
---|---|
Title | Fast Statistical Parsing Of Noun Phrases For Document Indexing |
Venue | Applied Natural Language Processing Conference |
Session | Main Conference |
Year | 1997 |
Authors |
|
Information Retrieval (IR) is an important application area of Natural Language Processing (NLP) where one encounters the genuine challenge of processing large quantities of unrestricted natural language text. While much effort has been made to apply NLP techniques to IR, very few NLP techniques have been evaluated on a document collection larger than several megabytes. Many NLP techniques are simply not efficient enough, and not robust enough, to handle a large amount of text. This paper proposes a new probabilistic model for noun phrase parsing, and reports on the application of such a parsing technique to enhance document indexing. The effectiveness of using syntactic phrases provided by the parser to supplement single words for indexing is evaluated with a 250 megabytes doc...