Paper: Tunisian dialect Wordnet creation and enrichment using web resources and other Wordnets

ACL ID W14-3613
Title Tunisian dialect Wordnet creation and enrichment using web resources and other Wordnets
Venue Workshop on Arabic Natural Language Processing
Session
Year 2014
Authors

In this paper, we propose TunDiaWN (Tunisian dialect Wordnet), a lexical resource for the dialect spoken in Tunisia. Our TunDiaWN construction approach is founded, on the one hand, on a corpus-based method to analyze and extract Tunisian dialect words. A clustering technique is adapted and applied to mine the possible relations between the extracted Tunisian dialect words and to group them into meaningful clusters. These suggestions are then evaluated and validated by experts to carry out the resource enrichment task. On the other hand, we reuse other Wordnet versions, mainly for English and Arabic, to propose a new database structure enriched by innovative features and entities.
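
The abstract does not say which clustering technique is adapted, so the following is only a minimal sketch of the general idea of grouping extracted words into candidate clusters for expert review. It assumes a simple greedy grouping by surface string similarity, which is a stand-in and not the authors' method. The function name `cluster_words`, the `threshold` parameter, and the toy transliterated strings are all hypothetical and chosen for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Surface similarity in [0, 1] between two strings (stdlib difflib)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_words(words, threshold=0.7):
    """Greedy single-pass grouping (an assumption, not the paper's algorithm).

    Each word joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for word in words:
        for cluster in clusters:
            if similarity(word, cluster[0]) >= threshold:
                cluster.append(word)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([word])
    return clusters


if __name__ == "__main__":
    # Toy transliterated strings, not data from the paper.
    candidates = ["kitab", "kitabi", "ktab", "dar", "dari"]
    for group in cluster_words(candidates):
        print(group)
```

In a workflow like the one the abstract describes, groups produced this way would only be suggestions; the validation step by experts decides which of them enter the resource.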