    • Automatic enrichment of terminological resources: the IATE RDF example 

      Arcan, Mihael; Montiel-Ponsoda, Elena; McCrae, John P.; Buitelaar, Paul (European Language Resources Association, 2018-05-07)
      Terminological resources have proven necessary in many organizations and institutions to ensure communication between experts. However, the maintenance of these resources is a very time-consuming and expensive process. ...
    • A comparison of emotion annotation schemes and a new annotated data set 

      Wood, Ian D.; McCrae, John P.; Andryushechkin, Vladimir; Buitelaar, Paul (European Language Resources Association (ELRA), 2018-05-07)
      While the recognition of positive/negative sentiment in text is an established task with many standard data sets and well developed methodologies, the recognition of more nuanced affect has received less attention, and ...
    • Cross-lingual sentence embedding using multi-task learning 

      Goswami, Koustava; Dutta, Sourav; Assem, Haytham; Fransen, Theodorus; McCrae, John P. (Association for Computational Linguistics, 2021-11-07)
      Multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but also for catering to a broad range of downstream cross-lingual NLP tasks. State-of-the-art multilingual ...
    • The ELEXIS interface for interoperable lexical resources 

      McCrae, John P.; Tiberius, Carole; Khan, Anas Fahad; Kernerman, Ilan; Declerck, Thierry; Krek, Simon; Monachini, Monica; Ahmadi, Sina (eLex 2019, 2019-10-01)
      ELEXIS is a project that aims to create a European network of lexical resources, and one of the key challenges for this is the development of an interoperable interface for different lexical resources so that further ...
    • Expanding wordnets to new languages with multilingual sense disambiguation 

      Arcan, Mihael; McCrae, John P.; Buitelaar, Paul (The COLING 2016 Organizing Committee, 2016-12-11)
      Princeton WordNet is one of the most important resources for natural language processing, but is only available for English. While it has been translated using the expand approach to many other languages, this is an ...
    • Improving wordnets for under-resourced languages using machine translation 

      Chakravarthi, Bharathi Raja; Arcan, Mihael; McCrae, John P. (Global Wordnet Association, 2018-01-08)
      Wordnets are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involve large research groups for a long period of time, which are typically not ...
    • Inferring translation candidates for multilingual dictionary generation with multi-way neural machine translation 

      Arcan, Mihael; Torregrosa, Daniel; Ahmadi, Sina; McCrae, John P. (National University of Ireland, Galway, 2019-05-20)
      In the widely-connected digital world, multilingual lexical resources are one of the most important resources for natural language processing applications, including information retrieval, question answering or knowledge ...
    • Linking knowledge graphs across languages with semantic similarity and machine translation 

      McCrae, John P.; Arcan, Mihael; Buitelaar, Paul (MLP 2017, 2017-09-04)
      Knowledge graphs and ontologies underpin many natural language processing applications, and to apply these to new languages, these knowledge graphs must be translated. Up until now, this has been achieved either by direct ...
    • A multilingual evaluation dataset for monolingual word sense alignment 

      Ahmadi, Sina; McCrae, John P.; Nimb, Sanni; Khan, Fahad; Monachini, Monica; Pedersen, Bolette S.; Declerck, Thierry; Wissik, Tanja; Bellandi, Andrea; Pisani, Irene; Troelsgård, Thomas; Olsen, Sussi; Krek, Simon; Lipp, Veronika; Váradi, Tamás; Simon, László; Gyorffy, Andras; Tiberius, Carole; Schoonheim, Tanneke; Moshe, Yifat Ben; Rudich, Maya; Ahmad, Raya Abu; Lonke, Dorielle; Kovalenko, Kira; Langemets, Margit; Kallas, Jelena; Oksana, Dereza; Fransen, Theodorus; Cillessen, David; Lindemann, David; Alonso, Mikel; Salgado, Ana; Sancho, Jose Luis; Urena-Ruiz, Rafael-J.; Zamorano, Jordi Porta; Simov, Kiril; Osenova, Petya; Kancheva, Zara; Radev, Ivaylo; Stankovic, Ranka; Perdih, Andrej; Gabrovsek, Dejan (National University of Ireland Galway, 2020-05-16)
      Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually ...
    • NUIG at TIAD 2021: Cross-lingual word embeddings for translation inference 

      Ahmadi, Sina; Ojha, Atul Kr.; Banerjee, Shubhanker; McCrae, John P. (2021-09-01)
      Inducing new translation pairs across dictionaries is an important task that facilitates processing and maintaining lexicographical data. This paper describes our submissions to the Translation Inference Across Dictionaries ...
    • NUIG-Panlingua-KMI Hindi-Marathi MT Systems for Similar Language Translation Task @ WMT 2020 

      Ojha, Atul Kr.; Rani, Priya; Bansal, Akanksha; Chakravarthi, Bharathi Raja; Kumar, Ritesh; McCrae, John P. (Association for Computational Linguistics, 2020-11-19)
      NUIG-Panlingua-KMI submission to WMT 2020 seeks to push the state-of-the-art in the Similar language translation task for the Hindi ↔ Marathi language pair. As part of these efforts, we conducted a series of experiments to ...
    • A survey of current datasets for code-switching research 

      Jose, Navya; Chakravarthi, Bharathi Raja; Suryawanshi, Shardul; Sherly, Elizabeth; McCrae, John P. (IEEE, 2020-03-06)
      Code switching is a prevalent phenomenon in the multilingual community and social media interaction. In the past ten years, we have witnessed an explosion of code switched data in the social media that brings together ...
    • Taxonomy extraction for customer service knowledge base construction 

      Pereira, Bianca; Robin, Cécile; Daudert, Tobias; McCrae, John P.; Mohanty, Pranab; Buitelaar, Paul (Springer, 2019-11-04)
      Customer service agents play an important role in bridging the gap between customers' vocabulary and business terms. In a scenario where organisations are moving into semi-automatic customer service, semantic technologies ...
    • TIAD 2019 Shared Task: Leveraging knowledge graphs with neural machine translation for automatic multilingual dictionary generation 

      Torregrosa, Daniel; Arcan, Mihael; Ahmadi, Sina; McCrae, John P. (National University of Ireland, Galway, 2019-04-20)
      This paper describes the different proposed approaches to the TIAD 2019 Shared Task, which consisted in the automatic discovery and generation of dictionaries leveraging multilingual knowledge bases. We present three methods ...
    • Towards a crowd-sourced WordNet for colloquial English 

      McCrae, John P.; Wood, Ian D.; Hicks, Amanda (The Global WordNet Association, 2018-01-08)
      Princeton WordNet is one of the most widely-used resources for natural language processing, but is updated only infrequently and cannot keep up with the fast-changing usage of the English language on social media ...
    • Towards automatic linking of lexicographic data: the case of a historical and a modern Danish dictionary 

      Ahmadi, Sina; Nimb, Sanni; McCrae, John P.; Sørensen, Nicolai H. (European Association for Lexicography, 2020)
      Given the diversity of lexical-semantic resources, particularly dictionaries, integrating such resources by aligning various types of information is an important task, both in e-lexicography and natural language processing. ...
    • Towards electronic lexicography for the Kurdish language 

      Ahmadi, Sina; Hassani, Hossein; McCrae, John P. (eLex 2019, 2019-10-01)
      This paper describes the development of lexicographic resources for Kurdish and provides a lexical model for this language. Kurdish is considered a less-resourced language, and currently, lacks machine-readable lexical ...
    • Unsupervised deep language and dialect identification for short texts 

      Goswami, Koustava; Sarkar, Rajdeep; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (International Committee on Computational Linguistics, 2020-12)
      Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects, is one of the primary steps in many natural language processing pipelines. Language identification ...