ACL Anthology Network (All About NLP) (beta)
ACL ID | D11-1144 |
---|---|
Title | Bootstrapped Named Entity Recognition for Product Attribute Extraction |
Venue | Conference on Empirical Methods in Natural Language Processing |
Session | Main Conference |
Year | 2011 |
Authors |
We present a named entity recognition (NER) system for extracting product attributes and values from listing titles. Information extraction from short listing titles presents a unique challenge because such titles lack informative context and grammatical structure. In this work, we combine supervised NER with bootstrapping to expand the seed list, and output normalized results. Focusing on listings from eBay’s clothing and shoes categories, our bootstrapped NER system is able to identify new brands corresponding to spelling variants and typographical errors of the known brands, as well as novel brands. Among the top 300 new brands predicted, our system achieves 90.33% precision. To output normalized attribute values, we explore several string comparison algorithms and found...
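The normalization step mentioned in the abstract, mapping spelling variants and typos back to a known brand, can be illustrated with a minimal sketch. The abstract is truncated before it names the string comparison algorithm the authors selected, so this example substitutes Python's standard-library `difflib.SequenceMatcher` ratio as a stand-in. The function names `similarity` and `normalize_brand`, the 0.8 threshold, and the sample brand list are all assumptions made for illustration, not details from the paper.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalize_brand(candidate, known_brands, threshold=0.8):
    """Return the closest known brand, or None if nothing is similar enough.

    The threshold is a hypothetical value chosen for this sketch; a None
    result marks the candidate as a possibly novel brand rather than a
    variant of a known one.
    """
    best_brand, best_score = None, 0.0
    for brand in known_brands:
        score = similarity(candidate, brand)
        if score > best_score:
            best_brand, best_score = brand, score
    return best_brand if best_score >= threshold else None


# Hypothetical seed list of known brands.
SEED_BRANDS = ["Nike", "Adidas", "Puma"]

if __name__ == "__main__":
    print(normalize_brand("Addidas", SEED_BRANDS))
    print(normalize_brand("Zara", SEED_BRANDS))
```

Under these assumptions, a misspelling such as "Addidas" resolves to "Adidas", while an unrelated string such as "Zara" falls below the threshold and is left unnormalized, which is roughly the distinction the abstract draws between spelling variants and novel brands.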