Paper: The Manually Annotated Sub-Corpus: A Community Resource for and by the People

ACL ID P10-2013
Title The Manually Annotated Sub-Corpus: A Community Resource for and by the People
Venue Annual Meeting of the Association for Computational Linguistics
Session Short Paper
Year 2010
Authors

The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a community-wide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, community-based effort to create much needed language resources for NLP. This paper describes the MASC project, its co...