Paper: Insights from Network Structure for Text Mining

ACL ID P11-1162
Title Insights from Network Structure for Text Mining
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011

Text mining and data harvesting algorithms have become popular in the computational linguistics community. They employ patterns that specify the kind of information to be harvested, and usually bootstrap either the pattern learning or the term harvesting process (or both) in a recursive cycle, using data learned in one step to generate more seeds for the next. They therefore treat the source text corpus as a network, in which words are the nodes and relations linking them are the edges. The results of computational network analysis, especially from the world wide web, are thus applicable. Surprisingly, these results have not yet been broadly introduced into the computational linguistics community. In this paper we show how various results apply to text mining, how they explai...