Paper: Temporal Classification Of Text And Automatic Document Dating

ACL ID N06-2008
Title Temporal Classification Of Text And Automatic Document Dating
Venue Human Language Technologies
Session Short Paper
Year 2006
Authors

Temporal information is presently under-utilised for document and text processing purposes. This work presents an unsupervised method of extracting periodicity information from text, enabling time series creation and filtering to be used in the creation of sophisticated language models that can discern between repetitive trends and non-repetitive writing patterns. The algorithm performs in O(n log n) time for input of length n. The temporal language model is used to create rules based on temporal-word associations inferred from the time series. The rules are used to automatically guess at likely document creation dates, based on the assumption that natural languages have unique signatures of changing word distributions over time. Experimental results on news items spanning a nine y...
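The abstract does not name the technique used to extract periodicity, so the following is only a minimal sketch under a stated assumption: that a word's occurrence counts per time step form a time series and that a fast Fourier transform, which runs in O(n log n), is used to find its dominant cycle. The function names `fft` and `dominant_period` and the sample data are hypothetical and do not come from the paper.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_period(counts):
    """Return the period, in time steps, of the strongest non-constant component."""
    n = len(counts)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two and at least 2")
    # Remove the mean so the constant (zero-frequency) term does not dominate.
    mean = sum(counts) / n
    spectrum = fft([c - mean for c in counts])
    # For real input only frequencies 1..n/2 carry independent information.
    best = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return n / best


# Hypothetical per-day counts of one word over 64 days with an 8-day cycle
# (assumed data, chosen only to illustrate the idea).
daily_counts = [10 + 5 * math.cos(2 * math.pi * day / 8) for day in range(64)]
period = dominant_period(daily_counts)
```

A word whose spectrum shows one strong peak would count as a repetitive trend, while a word with a flat spectrum would fall under non-repetitive patterns. How the paper itself makes that distinction is not stated in the abstract.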