Paper: Comma Restoration Using Constituency Information

ACL ID: N03-1029
Title: Comma Restoration Using Constituency Information
Venue: Human Language Technologies
Session: Main Conference
Year: 2003

Automatic restoration of punctuation from unpunctuated text has application in improving the fluency and applicability of speech recognition systems. We explore the possibility that syntactic information can be used to improve the performance of an HMM-based system for restoring punctuation (specifically, commas) in text. Our best methods reduce sentence error rate substantially, by some 20%, with an additional 8% reduction possible given improvements in extraction of the requisite syntactic information.

1 Motivation

The move from isolated word to connected speech recognition engendered a qualitative improvement in the naturalness of users’ interactions with speech transcription systems, sufficient even to make up in user satisfaction for some modest increase in error ra...