Paper: Linguistic Regularities in Continuous Space Word Representations

ACL ID: N13-1090
Title: Linguistic Regularities in Continuous Space Word Representations
Venue: Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session: Main Conference
Year: 2013
Authors

Continuous space language models have recently demonstrated outstanding results across a variety of tasks. In this paper, we examine the vector-space word representations that are implicitly learned by the input-layer weights. We find that these representations are surprisingly good at capturing syntactic and semantic regularities in language, and that each relationship is characterized by a relation-specific vector offset. This allows vector-oriented reasoning based on the offsets between words. For example, the male/female relationship is automatically learned, and with the induced vector representations, "King - Man + Woman" results in a vector very close to "Queen." We demonstrate that the word vectors capture syntactic regularities by means of syntactic analogy questions (provided...
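
The offset reasoning described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors are invented toy values rather than learned representations, the function names cosine and solve_analogy are hypothetical, and cosine similarity is assumed as the measure of "very close," which the abstract does not specify.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented for illustration only.
# Real representations would be learned by a language model.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.2, 0.8]),
    "apple": np.array([0.5, 0.5, 0.5]),
}


def cosine(u, v):
    """Cosine similarity between two vectors (assumed closeness measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def solve_analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'.

    The target is b - a + c; the three question words are excluded
    from the candidates so the answer is a new word.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# "King - Man + Woman" with the toy vectors above
print(solve_analogy("man", "king", "woman", vectors))  # prints "queen"
```

With these toy values the offset king - man equals the offset queen - woman, so the target vector coincides with "queen" up to floating-point rounding. Learned vectors would match only approximately, which is why the nearest candidate is selected rather than an exact match.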