ACL Anthology Network (All About NLP) (beta)
ACL ID | P07-1029 |
---|---|
Title | SVM Model Tampering and Anchored Learning: A Case Study in Hebrew NP Chunking |
Venue | Annual Meeting of the Association of Computational Linguistics |
Session | Main Conference |
Year | 2007 |
Authors |
|
We study the issue of porting a known NLP method to a language with few existing NLP resources, specifically Hebrew SVM-based chunking. We introduce two SVM-based methods – Model Tampering and Anchored Learning. These allow fine-grained analysis of the learned SVM models, which provides guidance to identify errors in the training corpus, distinguish the role and interaction of lexical features and eventually construct a model with ∼10% error reduction. The resulting chunker is shown to be robust in the presence of noise in the training corpus, relies on fewer lexical features than was previously understood and achieves an F-measure performance of 92.2 on automatically PoS-tagged text. The SVM analysis methods also provide general insight on SVM-based chunking.