Paper: Learning Discourse Relations With Active Data Selection

ACL ID W99-0620
Title Learning Discourse Relations With Active Data Selection
Venue 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
Session Main Conference
Year 1999

The paper presents a new approach to identifying discourse relations, which makes use of a sampling method called committee-based sampling (CBS). In committee-based sampling, multiple learning models are generated to measure the utility of an input example for classification; if the example is judged not useful, it is ignored. The method has the effect of reducing the amount of data required for training. In the paper, we extend CBS to decision tree classifiers. With an additional extension called error feedback, the method is found to achieve increased accuracy as well as a substantial reduction in the amount of data needed for training classifiers.
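
The selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that utility is measured by vote entropy (a common disagreement measure for committees), that an example is kept when disagreement exceeds a fixed threshold, and that committee members are plain callables rather than decision trees. The names `vote_entropy`, `select_examples`, and `threshold` are hypothetical, and the error feedback extension is not covered.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement among committee votes, normalized by log(committee size).

    Returns 0.0 when all members agree; larger values mean more disagreement.
    Assumption: vote entropy stands in for the paper's utility measure.
    """
    k = len(votes)
    if k < 2:
        return 0.0
    counts = Counter(votes)
    # Sum of p * log(1/p) over the labels that received at least one vote.
    h = sum((c / k) * math.log(k / c) for c in counts.values())
    return h / math.log(k)


def select_examples(committee, pool, threshold=0.5):
    """Keep only the examples on which the committee disagrees enough.

    committee: list of callables mapping an example to a label (hypothetical
               stand-ins for the multiple learned models).
    pool:      iterable of unlabeled examples.
    threshold: hypothetical cutoff; examples at or below it are ignored.
    """
    selected = []
    for example in pool:
        votes = [member(example) for member in committee]
        if vote_entropy(votes) > threshold:
            selected.append(example)
    return selected


# Toy committee: three threshold classifiers over numbers.
committee = [lambda x: x > 3, lambda x: x > 5, lambda x: x > 7]
pool = [1, 4, 6, 9]
selected = select_examples(committee, pool)
print(selected)
```

In this toy run the committee is unanimous on 1 and 9, so those examples are ignored, while 4 and 6 split the vote and are kept for labeling. A deterministic threshold is used here only for simplicity; a selection rule could equally be probabilistic.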