Creating a fine-grained corpus for a less-resourced language: the case of Kurdish

Omer Abdulrahman, Roshna; Hassani, Hossein; Ahmadi, Sina (NUI Galway, 2019-07-28)

Kurdish is a less-resourced language consisting of different dialects written in various scripts. Approximately 30 million people in different countries speak the language. The lack of corpora is one of the main obstacles ...

Creating a multilingual terminological resource using linked data: the case of archaeological domain in the Italian language

Carlino, Carola; Ahmadi, Sina; Speranza, Giulia (CEUR Workshop Proceedings, 2019-11-13)

The lack of multilingual terminological resources in specialized domains constitutes an obstacle to the access and reuse of information. In the technical domain of cultural heritage and, in particular, archaeology, such ...

Inferring translation candidates for multilingual dictionary generation with multi-way neural machine translation

Arcan, Mihael; Torregrosa, Daniel; Ahmadi, Sina; McCrae, John P. (National University of Ireland, Galway, 2019-05-20)

In the widely-connected digital world, multilingual lexical resources are one of the most important resources for natural language processing applications, including information retrieval, question answering or knowledge ...

NUIG at the FinSBD Task: Sentence boundary detection for noisy financial PDFs in English and French

Daudert, Tobias; Ahmadi, Sina (NUI Galway, 2019-08-12)

Portable Document Format (PDF) has become the industry-standard document as it is independent of the software, hardware or operating system. Publicly listed companies annually publish a variety of reports and too take ...

CoFiF: A corpus of financial reports in French language

Ahmadi, Sina; Daudert, Tobias (NUI Galway, 2019-08-12)

In an era when machine learning and artificial intelligence have huge momentum, the data demand to train and test models is steadily growing. We introduce CoFiF, the first corpus comprising company reports in the French ...

TIAD 2019 Shared Task: Leveraging knowledge graphs with neural machine translation for automatic multilingual dictionary generation

Torregrosa, Daniel; Arcan, Mihael; Ahmadi, Sina; McCrae, John P. (National University of Ireland, Galway, 2019-04-20)

This paper describes the different proposed approaches to the TIAD 2019 Shared Task, which consisted in the automatic discovery and generation of dictionaries leveraging multilingual knowledge bases. We present three methods ...

On lexicographical networks

Ahmadi, Sina; Arcan, Mihael; McCrae, John (NUI Galway, 2018-12-06)

In this study, we analyze various aspects of lexicographical networks. We seek to answer the research question: what are the characteristics of lexicographical networks? In addition to the existing notions of ...

Lexical sense alignment using weighted bipartite b-matching

Ahmadi, Sina; Arcan, Mihael; McCrae, John (NUI Galway, 2019-05-20)

In this study, we present a similarity-based approach for lexical sense alignment in WordNet and Wiktionary with a focus on the polysemous items. Our approach relies on semantic textual similarity using features such as ...

A corpus of the Sorani Kurdish folkloric lyrics

Ahmadi, Sina; Hassani, Hossein; Abedi, Kamaladdin (National University of Ireland Galway, 2020-05-16)

Kurdish poetry and prose narratives were historically transmitted orally and less in a written form. Being an essential medium of oral narration and literature, Kurdish lyrics have had a unique attribute in becoming a ...

A multilingual evaluation dataset for monolingual word sense alignment

Ahmadi, Sina; McCrae, John P.; Nimb, Sanni; Khan, Fahad; Monachini, Monica; Pedersen, Bolette S.; Declerck, Thierry; Wissik, Tanja; Bellandi, Andrea; Pisani, Irene; Troelsgård, Thomas; Olsen, Sussi; Krek, Simon; Lipp, Veronika; Váradi, Tamás; Simon, László; Gyorffy, Andras; Tiberius, Carole; Schoonheim, Tanneke; Moshe, Yifat Ben; Rudich, Maya; Ahmad, Raya Abu; Lonke, Dorielle; Kovalenko, Kira; Langemets, Margit; Kallas, Jelena; Oksana, Dereza; Fransen, Theodorus; Cillessen, David; Lindemann, David; Alonso, Mikel; Salgado, Ana; Sancho, Jose Luis; Urena-Ruiz, Rafael-J.; Zamorano, Jordi Porta; Simov, Kiril; Osenova, Petya; Kancheva, Zara; Radev, Ivaylo; Stankovic, Ranka; Perdih, Andrej; Gabrovsek, Dejan (National University of Ireland Galway, 2020-05-16)

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually ...