Paper: Using Three Way Data for Word Sense Discrimination

ACL ID C08-1117
Title Using Three Way Data for Word Sense Discrimination
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008

In this paper, an extension of a dimensionality reduction algorithm called NON-NEGATIVE MATRIX FACTORIZATION is presented that combines both ‘bag of words’ data and syntactic data, in order to find semantic dimensions according to which both words and syntactic relations can be classified. The use of three way data allows one to determine which dimension(s) are responsible for a certain sense of a word, and adapt the corresponding feature vector accordingly, ‘subtracting’ one sense to discover another one. The intuition in this is that the syntactic features of the syntax-based approach can be disambiguated by the semantic dimensions found by the bag of words approach. The novel approach is embedded into clustering algorithms, to make it fully automatic. The approach is ca...
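
As background for the abstract above, the following is a minimal sketch of plain non-negative matrix factorization, not the three-way extension the paper presents. It uses the standard multiplicative update rules for the Euclidean (Frobenius) objective. The toy word-by-context matrix `V`, the helper names (`matmul`, `transpose`, `nmf`, `frobenius_error`), and all parameter defaults are assumptions made for illustration only.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative V (n x m) as W (n x k) times H (k x m).

    Standard multiplicative updates for the Frobenius objective;
    eps only guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2
               for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5


# Hypothetical toy counts: rows are words, columns are context features.
# The last row mixes both column groups, mimicking an ambiguous word.
V = [
    [3, 2, 0, 0],
    [4, 3, 0, 0],
    [0, 0, 2, 5],
    [0, 0, 1, 4],
    [2, 1, 1, 3],
]

W, H = nmf(V, k=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Each row of `W` gives a word's loading on the `k` latent dimensions, and each row of `H` gives a dimension's loading on the context features. In this toy setting, a word with substantial weight on more than one dimension is the kind of case the abstract's sense 'subtracting' idea addresses; how the paper actually performs that step is not shown here.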