Paper: Tagging Spoken Language Using Written Language Statistics

ACL ID C96-2192
Title Tagging Spoken Language Using Written Language Statistics
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1996
Authors

This paper reports on two experiments in which a probabilistic part-of-speech tagger, trained on a tagged corpus of written Swedish, is used to tag a corpus of (transcribed) spoken Swedish. The results indicate that with very little adaptation an accuracy rate of 85% can be achieved, with an accuracy rate for known words of 90%. In addition, two different treatments of pauses were explored, but with no significant gain in accuracy under either condition.
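
The abstract reports two figures: overall accuracy and accuracy on known words. The sketch below shows one way such figures could be computed. It assumes that a "known" word is a token whose lowercased form occurs in the tagger's training vocabulary. The evaluation procedure is not described in this section, so the function name, the vocabulary test, and the example tokens and tag labels are illustrative assumptions, not the paper's method or data.

```python
def accuracy_report(tokens, gold_tags, predicted_tags, training_vocab):
    """Return overall and known-word tagging accuracy.

    Assumption: a token counts as "known" if its lowercased form is in
    training_vocab (a set of lowercased word forms seen during training).
    """
    if not (len(tokens) == len(gold_tags) == len(predicted_tags)):
        raise ValueError("tokens, gold_tags and predicted_tags must align")

    total = correct = known_total = known_correct = 0
    for token, gold, predicted in zip(tokens, gold_tags, predicted_tags):
        hit = gold == predicted
        total += 1
        correct += hit
        if token.lower() in training_vocab:
            known_total += 1
            known_correct += hit

    return {
        "overall": correct / total if total else 0.0,
        "known": known_correct / known_total if known_total else 0.0,
    }


# Hypothetical toy data: four spoken tokens, one of which ("eh") is
# absent from the written-language training vocabulary.
tokens = ["jag", "vet", "eh", "inte"]
gold = ["PN", "VB", "IJ", "AB"]
predicted = ["PN", "VB", "NN", "AB"]
vocab = {"jag", "vet", "inte"}

report = accuracy_report(tokens, gold, predicted, vocab)
```

In this toy case the only error falls on the out-of-vocabulary token, so the known-word accuracy is higher than the overall accuracy, which is the same direction as the gap between the 90% and 85% figures reported above.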