Paper: Learning a syntagmatic and paradigmatic structure from language data with a bi-multigram model

ACL ID C98-1047
Title Learning a syntagmatic and paradigmatic structure from language data with a bi-multigram model
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1998
Authors

In this paper, we present a stochastic language modeling tool which aims at retrieving variable-length phrases (multigrams), assuming bigram dependencies between them. The phrase retrieval can be intermixed with a phrase clustering procedure, so that the language data are iteratively structured at both a paradigmatic and a syntagmatic level in a fully integrated way. Perplexity results on ATR travel arrangement data with a bi-multigram model (assuming bigram correlations between the phrases) come very close to the trigram scores with a reduced number of entries in the language model. Also, the ability of the class version of the model to merge semantically related phrases into a common class is illustrated.
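
To make the idea of bigram dependencies between variable-length phrases concrete, the following is a minimal sketch, not the paper's implementation. It assumes that the likelihood of a sentence is the sum, over all segmentations into phrases of at most `max_len` words, of the product of phrase-bigram probabilities, and it computes that sum with a forward recursion. The names `bimultigram_likelihood`, `bigram_prob`, `max_len`, and the `START` marker are hypothetical and chosen for illustration only. The probability table is supplied by the caller, and no estimation or clustering step is shown.

```python
from collections import defaultdict

# Hypothetical sentence-start marker; not taken from the paper.
START = ("<s>",)


def bimultigram_likelihood(words, bigram_prob, max_len=3):
    """Sum over all segmentations of `words` into phrases of at most
    `max_len` words of the product of phrase-bigram probabilities.

    `bigram_prob(prev_phrase, phrase)` is a caller-supplied function that
    returns p(phrase | prev_phrase); both arguments are tuples of words.
    """
    T = len(words)
    # alpha[(t, l)]: total probability of words[:t], summed over all
    # segmentations whose last phrase has length l.
    alpha = defaultdict(float)
    for t in range(1, T + 1):
        for l in range(1, min(max_len, t) + 1):
            start = t - l
            phrase = tuple(words[start:t])
            if start == 0:
                # The phrase opens the sentence: condition on START.
                alpha[(t, l)] = bigram_prob(START, phrase)
                continue
            total = 0.0
            # Sum over every possible length k of the preceding phrase.
            for k in range(1, min(max_len, start) + 1):
                prev = tuple(words[start - k:start])
                total += alpha[(start, k)] * bigram_prob(prev, phrase)
            alpha[(t, l)] = total
    return sum(alpha[(T, l)] for l in range(1, min(max_len, T) + 1))


# Toy probability table (invented values, for illustration only).
table = {
    (START, ("a",)): 0.5,
    (("a",), ("b",)): 0.5,
    (START, ("a", "b")): 0.25,
}


def toy_prob(prev, cur):
    return table.get((prev, cur), 0.0)


likelihood = bimultigram_likelihood(["a", "b"], toy_prob, max_len=2)
```

With the toy table, the two segmentations `[a][b]` and `[a b]` each contribute 0.25, so `likelihood` is 0.5. Replacing the inner sum with a maximum would give the score of the single best segmentation instead.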