Paper: Dimensionality Reduction for Text using Domain Knowledge

ACL ID C10-2092
Title Dimensionality Reduction for Text using Domain Knowledge
Venue International Conference on Computational Linguistics
Session Poster Session
Year 2010
Authors

Text documents are complex, high-dimensional objects. To effectively visualize such data, it is important to reduce its dimensionality and visualize the low-dimensional embedding as a 2-D or 3-D scatter plot. In this paper we explore dimensionality reduction methods that draw upon domain knowledge in order to achieve a better low-dimensional embedding and visualization of documents. We consider the use of geometries specified manually by an expert, geometries derived automatically from corpus statistics, and geometries computed from linguistic resources.
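
To make the idea of a domain-informed geometry concrete, the following is a minimal sketch and not the paper's implementation. It assumes documents are term-frequency vectors, that domain knowledge is given as a symmetric positive semidefinite term-similarity matrix T (here a hypothetical, hand-specified one standing in for the "expert" case), and that classical multidimensional scaling is used for the 2-D embedding. The function names, the toy vocabulary, and the values in T are all illustrative assumptions.

```python
import numpy as np

def knowledge_distances(X, T):
    """Pairwise distances sqrt((x - y)^T T (x - y)) between the rows of X.

    T is an assumed symmetric positive semidefinite term-similarity matrix.
    """
    G = X @ T @ X.T                                   # Gram matrix under T
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G
    return np.sqrt(np.clip(sq, 0.0, None))            # clip tiny negatives

def classical_mds(D, k=2):
    """Embed points in k dimensions from a pairwise distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n               # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                       # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                    # eigenvalues, ascending
    idx = np.argsort(vals)[::-1][:k]                  # indices of the k largest
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# Toy vocabulary (hypothetical): ["goal", "match", "vote", "election"]
X = np.array([
    [1.0, 0.0, 0.0, 0.0],   # doc 0: sports
    [0.0, 1.0, 0.0, 0.0],   # doc 1: sports
    [0.0, 0.0, 1.0, 0.0],   # doc 2: politics
    [0.0, 0.0, 0.0, 1.0],   # doc 3: politics
])

# Hypothetical expert-specified similarity: terms within a topic are close.
T = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
])

D_plain = knowledge_distances(X, np.eye(4))   # ordinary Euclidean distances
D_know = knowledge_distances(X, T)            # distances under the geometry T
Y = classical_mds(D_know, k=2)                # 2-D coordinates for a scatter plot
```

In this toy setting the four documents share no terms, so plain Euclidean distance treats every pair as equally far apart. Under T, the two sports documents move close together, as do the two politics documents, and the 2-D embedding Y reflects that grouping. A geometry derived from corpus statistics or from a linguistic resource would plug into the same sketch by supplying a different T.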