Paper: Aligning Words Using Matrix Factorisation

ACL ID P04-1064
Title Aligning Words Using Matrix Factorisation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2004

Aligning words from sentences which are mutual translations is an important problem in different settings, such as bilingual terminology extraction, Machine Translation, or projection of linguistic features. Here, we view word alignment as matrix factorisation. In order to produce proper alignments, we show that factors must satisfy a number of constraints such as orthogonality. We then propose an algorithm for orthogonal non-negative matrix factorisation, based on a probabilistic model of the alignment data, and apply it to word alignment. This is illustrated on a French-English alignment task from the Hansard.
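To make the idea of alignment as orthogonal non-negative factorisation concrete, here is a minimal sketch in Python. It is not the probabilistic algorithm the paper proposes: it swaps in standard multiplicative-update NMF (squared-error loss) and then hardens each factor by an argmax, so that every word belongs to exactly one latent group and the hardened factors have orthogonal columns. The names `nmf`, `hard_assign`, `align`, the term "group" for a latent factor, and the toy association matrix `M` are all assumptions introduced for illustration.

```python
import random

EPS = 1e-9  # guards against division by zero in the updates


def _update(A, B, X):
    """One multiplicative step for A in X ~ A B^T: A <- A * (X B) / (A B^T B)."""
    k = len(B[0])
    BtB = [[sum(r[a] * r[b] for r in B) for b in range(k)] for a in range(k)]
    new_A = []
    for a_row, x_row in zip(A, X):
        num = [sum(x * b_row[c] for x, b_row in zip(x_row, B)) for c in range(k)]
        den = [sum(a_row[b] * BtB[b][c] for b in range(k)) for c in range(k)]
        new_A.append([a_row[c] * num[c] / (den[c] + EPS) for c in range(k)])
    return new_A


def nmf(M, k, iters=300, seed=0):
    """Plain NMF, M ~ F G^T, with F (source x k) and G (target x k) non-negative."""
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    F = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    G = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    Mt = [list(col) for col in zip(*M)]  # transpose, used when updating G
    for _ in range(iters):
        F = _update(F, G, M)
        G = _update(G, F, Mt)
    return F, G


def hard_assign(X):
    """Map each word (row) to its strongest group; the implied one-hot
    rows give a factor whose columns are mutually orthogonal."""
    return [max(range(len(row)), key=row.__getitem__) for row in X]


def align(M, k):
    """Link source word i to target word j when both fall in the same group."""
    F, G = nmf(M, k)
    src, tgt = hard_assign(F), hard_assign(G)
    return [(i, j) for i, s in enumerate(src) for j, t in enumerate(tgt) if s == t]


# Hypothetical association scores: rows are source words, columns target words.
M = [[5, 0, 0],
     [4, 0, 0],
     [0, 3, 3]]
links = align(M, k=2)
```

The hardening step is what makes the alignment "proper" in this sketch: because each word is tied to a single group, many-to-one and one-to-many links are possible, but a word can never be split across two unrelated groups. A fuller treatment would also need to choose the number of groups and handle words that should stay unaligned, which this sketch leaves out.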