Paper: Experiments In Parallel-Text Based Grammar Induction

ACL ID P04-1060
Title Experiments In Parallel-Text Based Grammar Induction
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2004
Authors
  • Jonas Kuhn (University of Texas at Austin, Austin TX)

This paper discusses the use of statistical word alignment over multiple parallel texts for the identification of string spans that cannot be constituents in one of the languages. This information is exploited in monolingual PCFG grammar induction for that language, within an augmented version of the inside-outside algorithm. Besides the aligned corpus, no other resources are required. We discuss an implemented system and present experimental results with an evaluation against the Penn Treebank.
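The idea of using word alignment to rule out candidate constituents can be illustrated with a minimal sketch. The abstract does not state the exact exclusion criterion, so the rule below is an assumption made for illustration: a source span is flagged when the target-side range covered by its aligned words also contains a word aligned to a source word outside the span. The function name `excluded_spans` and the representation of the alignment as (source index, target index) pairs are likewise hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int]   # half-open source span [i, j)
Link = Tuple[int, int]   # (source index, target index)


def excluded_spans(n: int, alignment: Set[Link]) -> Set[Span]:
    """Flag source spans that are unlikely constituents under one alignment.

    ASSUMED criterion (not taken from the paper): a span is flagged if the
    contiguous target range covered by its aligned words contains a target
    word aligned to a source word outside the span.
    """
    flagged: Set[Span] = set()
    for i in range(n):
        for j in range(i + 2, n + 1):      # spans of at least two words
            if j - i == n:                 # the whole sentence is never flagged
                continue
            inside = [t for (s, t) in alignment if i <= s < j]
            if not inside:                 # no alignment evidence for this span
                continue
            lo, hi = min(inside), max(inside)
            # Look for an intervening target word linked to an outside source word.
            if any(lo <= t <= hi and not (i <= s < j) for (s, t) in alignment):
                flagged.add((i, j))
    return flagged


# Four source words; source words 1 and 2 swap order on the target side.
links = {(0, 0), (1, 2), (2, 1), (3, 3)}
print(sorted(excluded_spans(4, links)))
```

In a chart-based inside-outside pass, one plausible way to use such a set is to force the inside probability of every flagged span to zero, so that no parse containing it contributes to the expected rule counts. With several parallel texts, the flagged sets from each language pair could be combined, for example by union; both choices are assumptions of this sketch rather than details given in the abstract.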