Source Paper Year Line Sentence
N12-1022 2012 206
In order to illustrate why using a single gold-standard reference segmentation can be problematic, we evaluate three publicly available segmenters, MinCutSeg (Malioutov and Barzilay, 2006), BayesSeg (Eisenstein and Barzilay, 2008) and APS (Kazantseva and Szpakowicz, 2011), using several different gold standards and then using all available annotations (self citation)
N12-1022 2012 8
Segmentation may be particularly beneficial when working with documents without overt structure: speech transcripts (Malioutov and Barzilay, 2006), newswire (Misra et al., 2011) or novels (Kazantseva and Szpakowicz, 2011) (self citation)
N13-1019 2013 167
To demonstrate the effectiveness of our model (denoted by TSM) in topic segmentation tasks, we evaluate it on three different kinds of corpora: a set of synthetic documents, two meeting transcripts and two sets of text books (see Tables 2 and 3); and compare TSM with the following methods: two baselines (the Random algorithm that places topic boundaries uniformly at random, and the Even algorithm that places a boundary after every mth text passage, where m is the average gold-standard segment length (Beeferman et al., 1999)), C99, MinCut, Bayesseg, APS (Kazantseva and Szpakowicz, 2011), and PLDA. Metrics: We evaluated the segmentation performance with PK (Beeferman et al., 1999) and WindowDiff (WDr) (Pevzner and Hearst, 2002), which are two common metrics used in topic segmentation
N13-1019 2013 37
Work following this line includes TextTiling (Hearst, 1997), which calculates the cosine similarity between two adjacent blocks of words purely based on the word frequency; C99 (Choi, 2000), an algorithm based on divisive clustering with a matrix-ranking scheme; LSeg (Galley et al., 2003), which uses a lexical chain to identify and weight word repetitions; U00 (Utiyama and Isahara, 2001), a probabilistic approach using dynamic programming to find a segmentation with a minimum cost; MinCut (Malioutov and Barzilay, 2006), which casts segmentation as a graph cut problem, and APS (Kazantseva and Szpakowicz, 2011), which uses affinity propagation to learn clustering for segmentation
N13-1019 2013 215
We further tested TSM on two written text datasets, Clinical (Eisenstein and Barzilay, 2008) and Fiction (Kazantseva and Szpakowicz, 2011)
P13-1167 2013 206
Three automatic segmenters were trained, or had their parameters estimated, upon The Moonstone data set, including MinCut (Malioutov and Barzilay, 2006), BayesSeg (Eisenstein and Barzilay, 2008), and APS (Kazantseva and Szpakowicz, 2011)
P13-1167 2013 10
for which a variety of automatic segmenters exist (e.g., Hearst 1997, Malioutov and Barzilay 2006, Eisenstein and Barzilay 2008, and Kazantseva and Szpakowicz 2011)