Paper: On Aligning Trees

ACL ID W97-0308
Title On Aligning Trees
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 1997

The increasing availability of corpora annotated for linguistic structure prompts the question: if we have the same texts, annotated for phrase structure under two different schemes, to what extent do the annotations agree on structuring within the text? We suggest the term tree alignment to indicate the situation where two markup schemes choose to bracket off the same text elements. We propose a general method for determining agreement between two analyses. We then describe an efficient implementation, which is also modular in that the core of the implementation can be reused regardless of the format of markup used in the corpora. The output of the implementation on the Susanne and Penn treebank corpora is discussed.
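
To make the notion of tree alignment concrete, the following minimal sketch treats each analysis as a set of token spans and reports the spans that both analyses bracket off. It is an illustration only, not the method or implementation described in the paper. It assumes both analyses cover the same tokenization, that every bracket carries a label, and that labels are ignored when comparing; the function names `spans` and `aligned_spans` and the example sentences are hypothetical.

```python
def spans(bracketed: str) -> set[tuple[int, int]]:
    """Return the (start, end) token spans bracketed in a string such as
    "(S (NP the cat) (VP sat))". Assumes every "(" is followed by a label."""
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    result: set[tuple[int, int]] = set()
    open_starts: list[int] = []  # start positions of currently open brackets
    pos = 0                      # number of word tokens consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            open_starts.append(pos)
            i += 2               # skip the bracket and its label
            continue
        if tok == ")":
            result.add((open_starts.pop(), pos))
        else:
            pos += 1             # an ordinary word token
        i += 1
    return result


def aligned_spans(a: str, b: str) -> set[tuple[int, int]]:
    """Spans that both analyses bracket off (labels ignored)."""
    return spans(a) & spans(b)


# Two hypothetical analyses of the same six-word text.
scheme_a = "(S (NP the cat) (VP sat (PP on (NP the mat))))"
scheme_b = "(S (NP the cat) (VP (VG sat on) (NP the mat)))"

shared = aligned_spans(scheme_a, scheme_b)
only_a = spans(scheme_a) - shared
only_b = spans(scheme_b) - shared
print(sorted(shared), sorted(only_a), sorted(only_b))
```

In this toy case the two schemes agree on four constituents and each posits one constituent the other lacks. Keeping the span comparison separate from the parsing of a particular markup format echoes the modularity the abstract mentions, though the paper's actual design may differ.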