Paper: A Memory-Based Approach to Learning Shallow Natural Language Patterns

ACL ID P98-1010
Title A Memory-Based Approach to Learning Shallow Natural Language Patterns
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998
Authors

Recognizing shallow linguistic patterns, such as basic syntactic relationships between words, is a common task in applied natural language and text processing. The common practice for approaching this task is by tedious manual definition of possible pattern structures, often in the form of regular expressions or finite automata. This paper presents a novel memory-based learning method that recognizes shallow patterns in new text based on a bracketed training corpus. The training data are stored as-is, in efficient suffix-tree data structures. Generalization is performed on-line at recognition time by comparing subsequences of the new text to positive and negative evidence in the corpus. This way, no information in the training is lost, as can happen in other learning syst...
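The idea of comparing subsequences of new text against positive and negative evidence in a bracketed corpus can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes part-of-speech tag sequences as input, replaces the suffix tree with a plain dictionary over all contiguous subsequences, and scores a candidate by a simple ratio of bracketed to total occurrences. The names `index_corpus` and `score` and the toy corpus are hypothetical.

```python
from collections import Counter

def index_corpus(corpus):
    """Count every contiguous tag subsequence in the training corpus.

    corpus: list of (tags, spans) pairs, where tags is a list of POS tags
    and spans is a set of half-open (start, end) bracket positions.
    Returns (pos, total): how often each subsequence occurred exactly as a
    bracketed pattern, and how often it occurred at all.
    Assumption: a dictionary stands in for the suffix tree named in the
    abstract; it stores the same evidence but less efficiently.
    """
    pos, total = Counter(), Counter()
    for tags, spans in corpus:
        n = len(tags)
        for i in range(n):
            for j in range(i + 1, n + 1):
                seq = tuple(tags[i:j])
                total[seq] += 1          # any occurrence
                if (i, j) in spans:
                    pos[seq] += 1        # occurrence as a bracketed pattern
    return pos, total

def score(seq, pos, total):
    """Fraction of corpus occurrences of seq that were bracketed.

    Returns None when seq never occurs in the corpus. The ratio is an
    illustrative scoring rule, not the one used in the paper.
    """
    seq = tuple(seq)
    if total[seq] == 0:
        return None
    return pos[seq] / total[seq]

# Toy bracketed corpus: noun-phrase spans over POS tag sequences.
corpus = [
    (["DT", "NN", "VB", "DT", "NN"], {(0, 2), (3, 5)}),
    (["NN", "VB", "DT", "JJ", "NN"], {(0, 1), (2, 5)}),
]
pos, total = index_corpus(corpus)
```

With this toy corpus, `score(["DT", "NN"], pos, total)` is high because every occurrence of that sequence was bracketed, while `score(["NN", "VB"], pos, total)` is zero because the sequence occurs only as negative evidence, straddling a bracket boundary.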