Paper: A Memory-Based Approach to Learning Shallow Natural Language Patterns

ACL ID C98-1010
Title A Memory-Based Approach to Learning Shallow Natural Language Patterns
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors

Recognizing shallow linguistic patterns, such as basic syntactic relationships between words, is a common task in applied natural language and text processing. The common practice for approaching this task is by tedious manual definition of possible pattern structures, often in the form of regular expressions or finite automata. This paper presents a novel memory-based learning method that recognizes shallow patterns in new text based on a bracketed training corpus. The training data are stored as-is, in efficient suffix-tree data structures. Generalization is performed on-line at recognition time by comparing subsequences of the new text to positive and negative evidence in the corpus. This way, no information in the training is lost, as can happen in ...
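
The idea of weighing positive against negative evidence from a bracketed corpus can be illustrated with a much-simplified sketch. It is not the paper's algorithm: it assumes the corpus is a list of part-of-speech tag sequences with `[` and `]` marking each pattern, it uses a plain `Counter` where the abstract mentions suffix trees, and it scores a candidate only by whole-sequence matches. The names `index_corpus` and `score` are hypothetical.

```python
from collections import Counter

def index_corpus(sentences):
    """Store the corpus as-is: count every tag subsequence, and separately
    count how often that subsequence coincides exactly with a bracketed span."""
    total = Counter()     # all occurrences of a tag subsequence
    positive = Counter()  # occurrences that are exactly one bracketed pattern
    for sent in sentences:
        tags, spans, start = [], set(), None
        for tok in sent:
            if tok == "[":
                start = len(tags)            # a pattern opens here
            elif tok == "]":
                spans.add((start, len(tags)))  # a pattern closes here
            else:
                tags.append(tok)
        for i in range(len(tags)):
            for j in range(i + 1, len(tags) + 1):
                seq = tuple(tags[i:j])
                total[seq] += 1
                if (i, j) in spans:
                    positive[seq] += 1
    return total, positive

def score(candidate, total, positive):
    """Fraction of corpus occurrences of the candidate that were bracketed.
    Unseen candidates get 0.0 (an assumption of this sketch)."""
    seq = tuple(candidate)
    return positive[seq] / total[seq] if total[seq] else 0.0

# Toy corpus (invented for illustration): brackets mark noun phrases.
corpus = [
    ["[", "DT", "NN", "]", "VB", "[", "DT", "JJ", "NN", "]"],
    ["VB", "[", "DT", "NN", "NN", "]"],
]
total, positive = index_corpus(corpus)
print(score(["DT", "NN"], total, positive))
```

Here `DT NN` appears twice in the toy corpus, once as a complete bracketed pattern and once inside a longer one, so the second occurrence counts as negative evidence against bracketing it on its own.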