Paper: Measuring Syntactic Difference in British English

ACL ID P07-3001
Title Measuring Syntactic Difference in British English
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2007

Recent work by Nerbonne and Wiersma (2006) has provided a foundation for mea- suring syntactic differences between cor- pora. It uses part-of-speech trigrams as an approximation to syntactic structure, com- paring the trigrams of two corpora for sta- tistically significant differences. This paper extends the method and its appli- cation. It extends the method by using leaf- path ancestors of Sampson (2000) instead of trigrams, which capture internal syntactic structure—every leaf in a parse tree records the path back to the root. The corpus used for testing is the Interna- tional Corpus of English, Great Britain (Nel- son et al. , 2002), which contains syntacti- cally annotated speech of Great Britain. The speakers are grouped into geographical re- gions based on place of birth. This is ...