Source Paper Year Line Sentence
W06-1614 2006 6
There have been a number of recent studies on probabilistic treebank parsing of German (Dubey, 2005; Dubey and Keller, 2003; Schiehlen, 2004; Schulte im Walde, 2003), using the Negra treebank (Skut et al., 1997) as their underlying data source
W06-1614 2006 207
We did not investigate the influence of treebank refinement in this study. However, we would like to note that by a combination of suffix analysis and smoothing, Dubey (2005) was able to obtain an F-score of 85.2 for Negra
W06-1614 2006 103
Previous work, such as Dubey (2005), Dubey and Keller (2003), and Schiehlen (2004), uses the version of Negra in which the standard approach to resolving crossing branches has been applied
W06-1628 2006 132
For each sentence pair from this data, we used a version of the German parser described by Dubey (2005) to parse the German component, and a version of the English parser described by Collins (1999) to parse the English component
P06-1041 2006 23
Dubey (2005) reported how serious this problem can be when he coupled a tagger with a subsequent parser, and noted that tagging errors are by far the most important source of parsing errors. As soon as more than two components are involved, the combination of different error sources might easily lead to a substantial decrease of the overall quality instead of achieving the desired synergy
P06-1041 2006 153
Here the state of the art for German is defined by a system which applies treebank transformations to the original NEGRA treebank and extends a Collins-style parser with a suffix analysis (Dubey, 2005)
P06-2029 2006 182
Combining treebank transformation techniques with a suffix analysis, (Dubey, 2005) trained a probabilistic parser and reached a labelled F-score of 76.3% on phrase structure annotations for a subset of the sentences used here (with a maximum length of 40)
W07-0718 2007 104
We used Cowan and Collins (2005)'s parser for Spanish, Arun and Keller (2005)'s for French, Dubey (2005)'s for German, and Bikel (2002)'s for English
N07-1051 2007 201
Table 4: Our final test set parsing performance compared to the best previous work on English, German and Chinese. Columns: Parser; LP and LR (≤ 40 words); LP and LR (all). ENGLISH: Charniak et al. (2005) 90.1 90.1 89.5 89.6; Petrov et al. (2006) 90.3 90.0 89.8 89.6; This Paper 90.7 90.5 90.2 89.9. ENGLISH (reranked): Charniak et al. (2005) 92.4 91.6 91.8 91.0. GERMAN: Dubey (2005) F1 76.3; This Paper 80.8 80.7 80.1 80.1. CHINESE: Chiang et al. (2002) 81.1 78.8 78.0 75.2; This Paper 80.8 80.7 78.8 78.5.
D07-1016 2007 75
The parsing experiments were performed with a state-of-the-art parser trained on the TIGER corpus which returns both phrase categories and grammatical functions (Dubey, 2005b) (self citation)
D07-1016 2007 79
Based upon an evaluation on the NEGRA treebank (Skut et al., 1998), using a 90%-5%-5% training-development-test split, the parser performs with an accuracy of 73.1 F-score on labelled brackets with a coverage of 99.1% (Dubey, 2005b) (self citation)
D07-1016 2007 131
In addition, we also report dependency accuracy (Dep), calculated using the approach described in Lin (1995), using the head-picking method used by Dubey (2005a) (self citation)
W08-1006 2008 9
Many of these techniques have been investigated in other work (Schiehlen, 2004; Dubey, 2004; Dubey, 2005), but, we hope that by consolidating, replicating, improving, and clarifying previous results we can contribute to the re-evaluation of German probabilistic parsing after a somewhat confusing start to initial literature in this area
W08-1005 2008 67
We applied our latent variable model directly to each of the treebanks, without any Table 1: Our split-and-merge latent variable approach produces the best published parsing performance on many languages. Columns: Parser; LP and LR (≤ 40 words); LP and LR (all). ENGLISH: Charniak et al. (2005) 90.1 90.1 89.5 89.6; Petrov and Klein (2007) 90.7 90.5 90.2 89.9. ENGLISH (reranked): Charniak et al. (2005) 92.4 91.6 91.8 91.0. GERMAN (NEGRA): Dubey (2005) F1 76.3; Petrov and Klein (2007) 80.8 80.7 80.1 80.1. CHINESE: Chiang et al. (2002) 81.1 78.8 78.0 75.2; Petrov and Klein (2007) 86.9 85.7 84.8 81.9.
W08-1007 2008 8
Earlier studies by Dubey and Keller (2003) and Dubey (2005) using the Negra treebank (Skut et al., 1997) report that lexicalization of PCFGs decreases the parsing accuracy when parsing Negra's flat constituent structures
W08-0309 2008 118
The English, French, German and Spanish test sets were automatically parsed using high quality parsers for those languages (Bikel, 2002; Arun and Keller, 2005; Dubey, 2005; Bick, 2006)
D09-1006 2009 85
To evaluate the candidate translation, the source parse tree is first obtained (Dubey, 2005), and each subtree is matched with a substring in the candidate string
W10-1401 2010 59
Dubey (2005) showed that, for German parsing, adding case and morphology information together with smoothed markovization and an adequate unknown-word model is more important than lexicalization (Dubey and Keller, 2003)
P11-2127 2011 92
For German, our parser outperforms Dubey (2005) and we are not far behind latent variable parsers, for which parsing is substantially (footnote 7: These statistics can be further improved with standard parsing micro-optimization.) (footnote 8: See Gildea (2001) and Petrov and Klein (2007) for the exact experimental setup that we followed here.)