Paper: Analysis Of Selective Strategies To Build A Dependency-Analyzed Corpus

ACL ID P06-2082
Title Analysis Of Selective Strategies To Build A Dependency-Analyzed Corpus
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006
Authors
  • Kiyonori Ohtake (ATR Spoken Language Communication Research Laboratories, Kyoto Japan)

This paper discusses sampling strategies for building a dependency-analyzed corpus and analyzes them with different kinds of corpora. We used the Kyoto Text Corpus, a dependency-analyzed corpus of newspaper articles, and prepared the IPAL corpus, a dependency-analyzed corpus of example sentences in dictionaries, as a new and different kind of corpus. The experimental results revealed that the length of the test set controlled the accuracy and that the longest-first strategy was good for an expanding corpus, but this was not the case when constructing a corpus from scratch.
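
As a rough illustration of what a longest-first selective strategy might look like, the sketch below picks the longest sentences from an unannotated pool until an annotation budget is reached. It is not the paper's implementation: the function name `longest_first`, the `budget` parameter, and the use of whitespace-separated token count as the length measure are all assumptions made for this example, since the abstract does not state how length is measured.

```python
def longest_first(sentences, budget):
    """Return up to `budget` sentences from the pool, longest first.

    Length is measured as the number of whitespace-separated tokens,
    which is an assumption of this sketch, not a detail from the paper.
    """
    # sorted() is stable, so equally long sentences keep their pool order.
    ranked = sorted(sentences, key=lambda s: len(s.split()), reverse=True)
    return ranked[:budget]


# Hypothetical toy pool; real input would be pre-tokenized sentences.
pool = [
    "a b c",
    "a b c d e",
    "a",
    "a b c d",
]
selected = longest_first(pool, budget=2)
```

Here `selected` holds the two longest sentences in the pool, which would then be passed to annotators before the shorter ones.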