Paper: Tagging And Alignment Of Parallel Texts: Current Status Of BCP

ACL ID A92-1031
Title Tagging And Alignment Of Parallel Texts: Current Status Of BCP
Venue Applied Natural Language Processing Conference
Session Main Conference
Year 1992
Authors

*Many thanks to Graham Russell for his invaluable advice on this abstract.
†ISSCO, 54 route des Acacias, Genève 1227, Switzerland
‡Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Malostranské náměstí 25, 118 00 Praha 1, Czechoslovakia

3 bcpmark: The Pre-Processor

bcpmark is the first step in preparing text for the alignment program. It marks paragraph and sentence boundaries, numbers, words, and punctuation, with the output in SGML notation. bcpmark is easily customized to suit a particular text type or language via a user-defined data file. Extensions and alterations to the data are accordingly simple. There are accompanying tools to check number standardization results and sentence boundary marking. Languages currently supported are Fr...
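
The kind of marking described above can be illustrated with a minimal Python sketch. It is not bcpmark itself: the tag names (`p`, `s`, `w`, `num`, `pun`), the `ABBREVIATIONS` set standing in for the user-defined data file, and the function names are all assumptions made for illustration, and the sentence-boundary rule is deliberately simplistic.

```python
import re
from html import escape

# Hypothetical stand-in for a user-defined data file: tokens whose
# trailing period does not end a sentence.
ABBREVIATIONS = {"Mr.", "Dr.", "etc."}
SENTENCE_END = {".", "!", "?"}

# A number (with internal . or , separators), a run of letters,
# or a single punctuation character.
TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|[^\W\d_]+|[^\w\s]|_")


def tokenize(text, abbreviations=ABBREVIATIONS):
    """Split text into tokens, keeping listed abbreviations whole."""
    matches = list(TOKEN_RE.finditer(text))
    tokens = []
    i = 0
    while i < len(matches):
        tok = matches[i].group()
        nxt = matches[i + 1] if i + 1 < len(matches) else None
        if (nxt is not None and nxt.group() == "."
                and nxt.start() == matches[i].end()
                and tok + "." in abbreviations):
            tokens.append(tok + ".")  # e.g. "Dr" + "." -> "Dr."
            i += 2
        else:
            tokens.append(tok)
            i += 1
    return tokens


def classify(tok):
    """Return the (assumed) tag name for a token."""
    if tok[0].isdigit():
        return "num"
    if tok[0].isalpha():
        return "w"
    return "pun"


def mark_paragraph(text):
    """Wrap one paragraph in <p>, its sentences in <s>, and tag each token."""
    sentences, current = [], []
    for tok in tokenize(text):
        tag = classify(tok)
        current.append(f"<{tag}>{escape(tok)}</{tag}>")
        if tok in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:  # trailing material without final punctuation
        sentences.append(current)
    body = "".join("<s>" + " ".join(s) + "</s>" for s in sentences)
    return "<p>" + body + "</p>"


def mark_text(text):
    """Treat blank lines as paragraph boundaries and mark each paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    return "\n".join(mark_paragraph(p) for p in paragraphs)
```

Calling `mark_text("Dr. Smith paid 3.5 francs. Fine!")` yields one `<p>` element containing two `<s>` elements. Adapting the sketch to another language or text type would amount to changing the abbreviation set and the token pattern, which loosely mirrors the role the paper assigns to the data file.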