Paper: Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora

ACL ID P14-1055
Title Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

Active learning (AL) has been proven effective to reduce human annotation efforts in NLP. However, previous studies on AL are limited to applications in a single language. This paper proposes a bilingual active learning paradigm for relation classification, where the unlabeled instances are first jointly chosen in terms of their prediction uncertainty scores in two languages and then manually labeled by an oracle. Instead of using a parallel corpus, labeled and unlabeled instances in one language are translated into ones in the other language and all instances in both languages are then fed into a bilingual active learning engine as pseudo parallel corpora. Experimental results on the ACE RDC 2005 Chinese and English corpora show that bilingual active learning for r...
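
The joint selection step described in the abstract can be illustrated with a minimal sketch. It assumes that each language has its own classifier producing a class probability distribution per instance, that uncertainty is measured by Shannon entropy, and that the two scores are combined by a simple average. The abstract does not state the uncertainty measure or the combination rule, so both are assumptions made for illustration, and the names `entropy` and `select_jointly` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_jointly(probs_lang_a, probs_lang_b, k):
    """Return indices of the k instance pairs with the highest joint uncertainty.

    probs_lang_a[i] is the predicted class distribution for instance i in one
    language; probs_lang_b[i] is the distribution for its translation.
    Averaging the two entropies is an assumption of this sketch, not
    necessarily the combination rule used in the paper.
    """
    scores = [
        (entropy(pa) + entropy(pb)) / 2.0
        for pa, pb in zip(probs_lang_a, probs_lang_b)
    ]
    # Rank instance indices from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy predictions for three instances (hypothetical values).
zh = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
en = [[0.8, 0.2], [0.5, 0.5], [0.9, 0.1]]
print(select_jointly(zh, en, 1))  # prints [1]
```

In this toy input the second instance is maximally uncertain in both languages, so it is the one that would be sent to the oracle for labeling.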