Show simple item record

dc.contributor.author: Omer Abdulrahman, Roshna
dc.contributor.author: Hassani, Hossein
dc.contributor.author: Ahmadi, Sina
dc.identifier.citation: Omer Abdulrahman, Roshna, Hassani, Hossein, & Ahmadi, Sina. (2019). Creating a Fine-Grained Corpus for a Less-resourced Language: the case of Kurdish. Paper presented at the WiNLP 2019 workshop (ACL 2019), Florence, Italy, 28 July.
dc.description.abstract: Kurdish is a less-resourced language consisting of different dialects written in various scripts. Approximately 30 million people in different countries speak the language. The lack of corpora is one of the main obstacles in Kurdish language processing. In this paper, we present KTC, the Kurdish Textbooks Corpus, which is composed of 31 K-12 textbooks in the Sorani dialect. The corpus is normalized and categorized into 12 educational subjects containing 693,800 tokens (110,297 types). Our resource is publicly available for non-commercial use under the CC BY-NC-SA 4.0 license. [en_IE]
dc.description.sponsorship: We would like to acknowledge the generous assistance of the Ministry of Education of the Kurdistan Region of Iraq, particularly the General Directorate of Curriculum and Printing, for providing us with the data for the KTC corpus. Our special gratitude goes to Ms. Namam Jalal Rasheed and Mr. Kawa Omer Muhammad for their assistance in making the required data available and resolving the copyright issues. [en_IE]
dc.publisher: NUI Galway [en_IE]
dc.relation.ispartof: WiNLP 2019 workshop (ACL 2019) [en]
dc.subject: Fine-Grained Corpus [en_IE]
dc.subject: Kurdish language processing [en_IE]
dc.title: Creating a fine-grained corpus for a less-resourced language: the case of Kurdish [en_IE]
dc.type: Workshop paper [en_IE]
dc.local.contact: Sina Ahmadi, The Insight Centre for Data Analytics, National University of Ireland, Galway, The Deri Building. Email:

Attribution-NonCommercial-NoDerivs 3.0 Ireland
This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland license. No item may be reproduced for commercial purposes. Please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.
