Paper: Automatic Editing in a Back-End Speech-to-Text System

ACL ID P08-1014
Title Automatic Editing in a Back-End Speech-to-Text System
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008

Written documents created through dictation differ significantly from a true verbatim transcript of the recorded speech. This poses an obstacle in automatic dictation systems as speech recognition output needs to undergo a fair amount of editing in order to turn it into a document that complies with the customary standards. We present an approach that attempts to perform this edit from recognized words to final document automatically by learning the appropriate transformations from example documents. This addresses a number of problems in an integrated way, which have so far been studied independently, in particular automatic punctuation, text segmentation, error correction and disfluency repair. We study two different learning methods, one based on rule induction and one based o...