Paper: Creating Custom Taggers by Integrating Web Page Annotation and Machine Learning

ACL ID C14-2004
Title Creating Custom Taggers by Integrating Web Page Annotation and Machine Learning
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

We present ongoing work on a software package that integrates discriminative machine learning with the open source WebAnnotator system of Tannier (2012). WebAnnotator allows users to annotate web pages within their browser using custom tag sets. We couple it with a machine learning package that enables automatic tagging of new web pages. We hope the software will evolve into a useful information extraction tool for motivated hobbyists who have domain expertise in their task of interest but lack machine learning or programming knowledge. This paper presents the system architecture, including the WebAnnotator-based front-end and the machine learning component. The system is available under an open source license.
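
To make the link between the annotation front-end and the learning component concrete, the following is a minimal sketch of how annotated pages might be turned into training data for a discriminative tagger. It is an illustration under stated assumptions, not the system's actual pipeline: the `data-tag` attribute, the whitespace tokenization, the BIO label scheme, and the names `AnnotationToBIO` and `html_to_bio` are all hypothetical, and WebAnnotator's real markup may differ.

```python
from html.parser import HTMLParser


class AnnotationToBIO(HTMLParser):
    """Collect (token, label) pairs from annotated HTML.

    Assumption: annotated text is wrapped in <span data-tag="..."> elements.
    This attribute name is hypothetical, not WebAnnotator's real markup.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []       # output: list of (token, BIO label)
        self._stack = []      # one entry per open <span>: its tag label or None
        self._begin = False   # True until the first token of a new span is seen

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            label = dict(attrs).get("data-tag")
            self._stack.append(label)
            if label is not None:
                self._begin = True

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # The innermost labelled span determines the label of the text.
        label = next((l for l in reversed(self._stack) if l is not None), None)
        for token in data.split():  # assumption: whitespace tokenization
            if label is None:
                self.pairs.append((token, "O"))
            else:
                prefix = "B" if self._begin else "I"
                self.pairs.append((token, f"{prefix}-{label}"))
                self._begin = False


def html_to_bio(html):
    """Return token-level BIO labels for one annotated page."""
    parser = AnnotationToBIO()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.pairs


if __name__ == "__main__":
    page = 'Visit <span data-tag="CITY">New York</span> today'
    for token, label in html_to_bio(page):
        print(token, label)
```

Token and label sequences of this kind are a common input format for sequence taggers, so the output could be passed to whichever learning package is used; the choice of learner is left open here.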