Paper: Enhancing Multilingual Latent Semantic Analysis with Term Alignment Information

ACL ID C08-1007
Title Enhancing Multilingual Latent Semantic Analysis with Term Alignment Information
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008
Authors

Latent Semantic Analysis (LSA) is based on the Singular Value Decomposition (SVD) of a term-by-document matrix for identifying relationships among terms and documents from co-occurrence patterns. Among the multiple ways of computing the SVD of a rectangular matrix X, one approach is to compute the eigenvalue decomposition (EVD) of a square 2 × 2 composite matrix consisting of four blocks with X and X^T in the off-diagonal blocks and zero matrices in the diagonal blocks. We point out that significant value can be added to LSA by filling in some of the values in the diagonal blocks (corresponding to explicit term-to-term or document-to-document associations) and computing a term-by-concept matrix from the EVD. For the case of multilingual LSA, we incorporate inf...
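
The relationship between the SVD of X and the EVD of the composite matrix described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `composite_matrix` and `top_eigenpairs`, the parameters `term_assoc` and `doc_assoc`, and the choice to keep the k algebraically largest eigenvalues are all assumptions made for illustration. With zero diagonal blocks, the k = min(m, n) largest eigenvalues of the composite matrix equal the singular values of X (assuming X has full rank), and each eigenvector stacks a left and a right singular vector, each scaled by 1/sqrt(2).

```python
import numpy as np

def composite_matrix(X, term_assoc=None, doc_assoc=None):
    """Build the symmetric block matrix [[D1, X], [X^T, D2]].

    D1 (terms x terms) and D2 (docs x docs) default to zero blocks,
    which is the plain-SVD case. The parameter names are hypothetical.
    """
    m, n = X.shape
    D1 = np.zeros((m, m)) if term_assoc is None else term_assoc
    D2 = np.zeros((n, n)) if doc_assoc is None else doc_assoc
    return np.block([[D1, X], [X.T, D2]])

def top_eigenpairs(B, k):
    """Return the k algebraically largest eigenvalues of symmetric B
    and their eigenvectors (selection rule is an assumption here)."""
    evals, evecs = np.linalg.eigh(B)        # eigh returns ascending order
    order = np.argsort(evals)[::-1][:k]     # indices of the k largest
    return evals[order], evecs[:, order]

# Toy term-by-document matrix: 5 terms, 3 documents (random, illustrative only).
rng = np.random.default_rng(0)
X = rng.random((5, 3))
m, n = X.shape
k = min(m, n)

# Case 1: zero diagonal blocks reproduce the SVD of X.
evals, evecs = top_eigenpairs(composite_matrix(X), k)
sigma = evals                               # singular values of X
U = evecs[:m, :] * np.sqrt(2)               # left singular vectors (terms)
V = evecs[m:, :] * np.sqrt(2)               # right singular vectors (documents)

# Case 2: a made-up symmetric term-to-term association block.
# Here terms 0 and 1 are treated as associated with weight 1.0.
D1 = np.zeros((m, m))
D1[0, 1] = D1[1, 0] = 1.0
evals_aug, evecs_aug = top_eigenpairs(composite_matrix(X, term_assoc=D1), k)
term_by_concept = evecs_aug[:m, :]          # rows of the eigenvectors for terms
```

In the second case the eigenvalues no longer coincide with the singular values of X, so `term_by_concept` is a different projection from the plain LSA one; how the association values should be chosen and weighted is left to the paper itself.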