Paper: HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce

ACL ID E12-2020
Title HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session System Demonstration
Year 2012
Authors

We propose a set of open-source software modules to perform structured Perceptron Training, Prediction and Evaluation within the Hadoop framework. Apache Hadoop is a freely available environment for running distributed applications on a computer cluster. The software is designed within the MapReduce paradigm. Thanks to distributed computing, the proposed software substantially reduces execution times while handling huge data sets. The distributed Perceptron training algorithm preserves convergence properties, and thus guarantees the same accuracy as the serial Perceptron. The presented modules can be executed as stand-alone software or easily extended or integrated into complex systems. The execution of the modules applied to specific NLP tasks can be demonstrated and t...