Paper: An Algorithm For Finding Noun Phrase Correspondences In Bilingual Corpora

ACL ID P93-1003
Title An Algorithm For Finding Noun Phrase Correspondences In Bilingual Corpora
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1993
Authors

The paper describes an algorithm that employs English and French text taggers to associate noun phrases in an aligned bilingual corpus. The taggers provide part-of-speech categories which are used by finite-state recognizers to extract simple noun phrases for both languages. Noun phrases are then mapped to each other using an iterative re-estimation algorithm that bears similarities to the Baum-Welch algorithm which is used for training the taggers. The algorithm provides an alternative to other approaches for finding word correspondences, with the advantage that linguistic structure is incorporated. Improvements to the basic algorithm are described, which enable context to be accounted for when constructing the noun phrase mappings.
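
The two stages named in the abstract, finite-state noun phrase extraction over part-of-speech tags and iterative re-estimation of phrase mappings, can be illustrated with a minimal sketch. This is not the paper's implementation. The tag names (DET, ADJ, NOUN), the English-style phrase pattern, the function names, and the toy corpus are all assumptions made for illustration, and the re-estimation step is a generic expectation-maximization update in the style of IBM Model 1, substituted here because the abstract does not give the exact update rule.

```python
from collections import defaultdict


def extract_noun_phrases(tagged):
    """Toy finite-state recognizer (assumed pattern: DET? ADJ* NOUN+).

    `tagged` is a list of (word, tag) pairs; the tag set is hypothetical.
    """
    phrases, current, has_noun = [], [], False
    # A sentinel token flushes any phrase still open at the end.
    for word, tag in list(tagged) + [("", "END")]:
        if tag == "NOUN":
            current.append(word)
            has_noun = True
        elif tag in ("DET", "ADJ") and not has_noun:
            current.append(word)
        else:
            if has_noun:
                phrases.append(" ".join(current))
            current, has_noun = [], False
            if tag in ("DET", "ADJ"):
                current.append(word)  # this token may start the next phrase
    return phrases


def reestimate(pairs, iterations=10):
    """EM-style re-estimation of P(french_np | english_np).

    `pairs` is a list of (english_nps, french_nps), one entry per aligned
    region. Starts from a uniform distribution and redistributes expected
    co-occurrence counts on every pass.
    """
    french_vocab = {f for _, frs in pairs for f in frs}
    uniform = 1.0 / max(len(french_vocab), 1)
    prob = defaultdict(lambda: uniform)
    for _ in range(iterations):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for ens, frs in pairs:
            for f in frs:
                # Share one unit of count for f among the English candidates.
                norm = sum(prob[(e, f)] for e in ens)
                for e in ens:
                    share = prob[(e, f)] / norm
                    counts[(e, f)] += share
                    totals[e] += share
        prob = defaultdict(
            float, {(e, f): c / totals[e] for (e, f), c in counts.items()}
        )
    return prob


def best_match(prob, english_np):
    """Return the French noun phrase with the highest estimated probability."""
    candidates = {f: p for (e, f), p in prob.items() if e == english_np}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Invented aligned regions, already reduced to noun phrase lists.
    corpus = [
        (["the house", "the book"], ["la maison", "le livre"]),
        (["the house"], ["la maison"]),
        (["the book", "a pen"], ["le livre", "un stylo"]),
    ]
    table = reestimate(corpus)
    for np_en in ("the house", "the book", "a pen"):
        print(np_en, "->", best_match(table, np_en))
```

The sketch treats each noun phrase as an atomic unit, which reflects the abstract's point that correspondences are sought between phrases rather than individual words. It ignores the context-sensitive improvements the abstract mentions, and a French recognizer would need a different pattern to handle post-nominal adjectives.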