Paper: Fast Statistical Parsing Of Noun Phrases For Document Indexing

ACL ID A97-1046
Title Fast Statistical Parsing Of Noun Phrases For Document Indexing
Venue Applied Natural Language Processing Conference
Session Main Conference
Year 1997
Authors

Information Retrieval (IR) is an important application area of Natural Language Processing (NLP) where one encounters the genuine challenge of processing large quantities of unrestricted natural language text. While much effort has been made to apply NLP techniques to IR, very few NLP techniques have been evaluated on a document collection larger than several megabytes. Many NLP techniques are simply not efficient enough, and not robust enough, to handle a large amount of text. This paper proposes a new probabilistic model for noun phrase parsing, and reports on the application of such a parsing technique to enhance document indexing. The effectiveness of using syntactic phrases provided by the parser to supplement single words for indexing is evaluated with a 250-megabyte doc...