Paper: Part-of-Speech Tagging for Middle English through Alignment and Projection of Parallel Diachronic Texts

ACL ID D07-1041
Title Part-of-Speech Tagging for Middle English through Alignment and Projection of Parallel Diachronic Texts
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007
Authors

We demonstrate an approach for inducing a tagger for historical languages based on ex- isting resources for their modern varieties. Tags from Present Day English source text are projected to Middle English text using alignments on parallel Biblical text. We explore the use of multiple alignment ap- proaches and a bigram tagger to reduce the noise in the projected tags. Finally, we train a maximum entropy tagger on the output of the bigram tagger on the target Biblical text and test it on tagged Middle English text. This leads to tagging accuracy in the low 80’s on Biblical test material and in the 60’s on other Middle English material. Our re- sults suggest that our bootstrapping meth- ods have considerable potential, and could be used to semi-automate an approach based on incremental ...