Paper: Multilingual Lexical Database Generation From Parallel Texts In 20 European Languages With Endogenous Resources

ACL ID P06-2035
Title Multilingual Lexical Database Generation From Parallel Texts In 20 European Languages With Endogenous Resources
Venue Annual Meeting of the Association for Computational Linguistics
Session Poster Session
Year 2006
Authors

This paper deals with multilingual database generation from parallel corpora. The idea is to contribute to the enrichment of lexical databases for languages with few linguistic resources. Our approach is endogenous: it relies on the raw texts only and does not require external linguistic resources such as stemmers or taggers. The system produces alignments for the 20 European languages of the ‘Acquis Communautaire’ Corpus.