Paper: Random Manhattan Integer Indexing: Incremental L1 Normed Vector Space Construction

ACL ID D14-1178
Title Random Manhattan Integer Indexing: Incremental L1 Normed Vector Space Construction
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014
Authors

Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in the distributional approaches to semantics. In VSMs, high-dimensional vectors represent linguistic entities. In an application, the similarity of vectors, and thus of the entities that they represent, is computed by a distance formula. The high dimensionality of vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a novel technique called Random Manhattan Indexing (RMI) for the construction of ℓ1-normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental and thus scalable two-step proc...
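The abstract is cut off before the two steps are described, so the following is only a minimal sketch of how an incremental, reduced-dimension ℓ1-normed space could be built, not the authors' procedure. It assumes the textbook construction for ℓ1-preserving random projections: each context receives a random index vector with standard Cauchy entries, entity vectors are accumulated as co-occurrences are observed, and the ℓ1 distance is estimated by the median of absolute coordinate differences. The class name `IncrementalL1Space`, its methods, and the choice of distribution and estimator are all hypothetical and do not come from the excerpt.

```python
import math
import random
from statistics import median


def manhattan(u, v):
    """Exact L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cauchy_index_vector(dim, rng):
    """Hypothetical index vector with i.i.d. standard Cauchy entries
    (assumption: the excerpt does not state the distribution used)."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]


class IncrementalL1Space:
    """Illustrative two-step construction: (1) assign an index vector to
    each new context, (2) add it to the vector of every co-occurring entity."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.index_vectors = {}   # context -> random index vector
        self.entity_vectors = {}  # entity  -> accumulated reduced vector

    def observe(self, entity, context, weight=1.0):
        # Step 1: lazily create the index vector for an unseen context.
        if context not in self.index_vectors:
            self.index_vectors[context] = cauchy_index_vector(self.dim, self.rng)
        # Step 2: accumulate it into the entity's vector.
        vec = self.entity_vectors.setdefault(entity, [0.0] * self.dim)
        for i, r in enumerate(self.index_vectors[context]):
            vec[i] += weight * r

    def estimated_distance(self, a, b):
        # Median estimator (assumption): for Cauchy projections, the median of
        # |u_i - v_i| approximates the L1 distance of the original count vectors.
        u, v = self.entity_vectors[a], self.entity_vectors[b]
        return median(abs(x - y) for x, y in zip(u, v))
```

For example, observing entity "a" with context "c1" twice and "c2" once, and entity "b" with "c2" once and "c3" three times, gives original count vectors whose exact ℓ1 distance is 5; `estimated_distance("a", "b")` should approach that value as `dim` grows, without the full co-occurrence matrix ever being stored.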