Paper: A Corpus-Based Approach To Topic In Danish Dialog

ACL ID P05-2019
Title A Corpus-Based Approach To Topic In Danish Dialog
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2005

We report on an investigation of the pragmatic category of topic in Danish dialog and its correlation to surface features of NPs. Using a corpus of 444 utterances, we trained a decision tree system on 16 features. The system achieved near-human performance with success rates of 84–89% and F1-scores of 0.63–0.72 in 10-fold cross validation tests (human performance: 89% and 0.78). The most important features turned out to be preverbal position, definiteness, pronominalisation, and non-subordination. We discovered that NPs in epistemic matrix clauses (e.g. “I think...”) were seldom topics and we suspect that this holds for other interpersonal matrix clauses as well.
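The abstract reports two different scores, a success rate and an F1-score, obtained under 10-fold cross validation. The following is a minimal sketch of how these quantities are commonly computed, not the authors' implementation. The function names, the toy labels, and the choice to split 444 items round-robin into folds are all assumptions made for illustration; the abstract does not say how the folds were formed or which items were classified.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_items: int, k: int = 10) -> List[List[int]]:
    """Partition item indices 0..n_items-1 into k disjoint test folds.

    Round-robin assignment is an assumption; any disjoint partition works.
    """
    if k < 2 or n_items < k:
        raise ValueError("need at least 2 folds and at least k items")
    return [list(range(start, n_items, k)) for start in range(k)]


def success_rate_and_f1(
    gold: Sequence[bool], pred: Sequence[bool]
) -> Tuple[float, float]:
    """Return (success rate, F1) for a binary topic / non-topic labelling.

    Success rate is the share of correct decisions; F1 is the harmonic
    mean of precision and recall on the positive ("topic") class.
    """
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and equally long")
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    success_rate = (tp + tn) / len(gold)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return success_rate, f1


# Hypothetical toy labels (not data from the paper): 3 topics among 10 NPs.
toy_gold = [True, True, True] + [False] * 7
toy_pred = [True, True, False, True] + [False] * 6

rate, f1 = success_rate_and_f1(toy_gold, toy_pred)
print(f"success rate = {rate:.2f}, F1 = {f1:.2f}")

folds = k_fold_indices(444, k=10)
print([len(fold) for fold in folds])
```

In the toy labelling, one topic is missed and one non-topic is wrongly marked, so the success rate (8 of 10 correct) is higher than the F1-score (2/3). This is the usual pattern when the positive class is the minority, and it is one plausible reading of why the reported success rates sit well above the reported F1-scores.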