Paper: Language-Independent Bilingual Terminology Extraction from a Multilingual Parallel Corpus

ACL ID E09-1057
Title Language-Independent Bilingual Terminology Extraction from a Multilingual Parallel Corpus
Venue Annual Meeting of the European Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2009
Authors

We present a language-pair independent terminology extraction module that is based on a sub-sentential alignment system that links linguistically motivated phrases in parallel texts. Statistical filters are applied on the bilingual list of candidate terms that is extracted from the alignment output. We compare the performance of both the alignment and terminology extraction module for three different language pairs (French-English, French-Italian and French-Dutch) and highlight language-pair specific problems (e.g. different compounding strategy in French and Dutch). Comparisons with standard terminology extraction programs show an improvement of up to 20% for bilingual terminology extraction and competitive results (85% to 90% accuracy) for monolingual terminology extractio...