ACL ID | C98-1047 |
---|---|
Title | Learning a syntagmatic and paradigmatic structure from language data with a bi-multigram model |
Venue | International Conference on Computational Linguistics |
Session | Main Conference |
Year | 1998 |
Authors |
|
In this paper, we present a stochastic language modeling tool which aims at retrieving variable-length phrases (multigrams), assuming bigram dependencies between them. The phrase retrieval can be intermixed with a phrase clustering procedure, so that the language data are iteratively structured at both a paradigmatic and a syntagmatic level in a fully integrated way. Perplexity results on ATR travel arrangement data with a bi-multigram model (assuming bigram correlations between the phrases) come very close to the trigram scores with a reduced number of entries in the language model. Also the ability of the class version of the model to merge semantically related phrases into a common class is illustrated.
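To make the idea of bigram dependencies between variable-length phrases concrete, the following is a minimal sketch of how a word sequence might be segmented under such a model. It is not the paper's estimation procedure, which the abstract does not describe. The function names `best_segmentation` and `toy_logprob`, the `max_len` limit, the `<s>` start symbol, and every probability in `TOY` are assumptions invented for illustration, and the toy table is not a normalized distribution.

```python
import math

def best_segmentation(words, bigram_logprob, max_len=3, start="<s>"):
    """Viterbi search for the most likely split of `words` into phrases.

    Each phrase has at most `max_len` words and is scored by
    bigram_logprob(previous_phrase, phrase), i.e. log p(phrase | previous phrase).
    """
    n = len(words)
    # best[i] maps "last phrase ending at word i" -> (log score, backpointer)
    best = [dict() for _ in range(n + 1)]
    best[0][start] = (0.0, None)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            phrase = tuple(words[j:i])
            for prev, (score, _) in best[j].items():
                cand = score + bigram_logprob(prev, phrase)
                if phrase not in best[i] or cand > best[i][phrase][0]:
                    best[i][phrase] = (cand, (j, prev))
    # Pick the best final phrase, then follow backpointers to the start.
    last = max(best[n], key=lambda p: best[n][p][0])
    total = best[n][last][0]
    segmentation, pos, phrase = [], n, last
    while phrase != start:
        segmentation.append(phrase)
        pos, phrase = best[pos][phrase][1]
    segmentation.reverse()
    return segmentation, total

# Hypothetical phrase-bigram probabilities (made up, not from the paper).
TOY = {
    ("<s>", ("i", "would", "like")): 0.5,
    (("i", "would", "like"), ("a", "room")): 0.4,
}

def toy_logprob(prev, phrase):
    # Unseen phrase pairs get a small floor probability (an assumption).
    return math.log(TOY.get((prev, phrase), 1e-4))

segmentation, total = best_segmentation("i would like a room".split(), toy_logprob)
print(segmentation, total)
```

With this toy table the search groups the five words into two phrases, because the two listed phrase pairs outweigh any split that relies on the floor probability. Keeping the last phrase in the search state is what lets each phrase be conditioned on its predecessor; with context-independent phrase scores a single score per position would suffice.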