    • Asistent -- a machine translation system for Slovene, Serbian and Croatian 

      Arcan, Mihael; Popovic, Maja; Buitelaar, Paul (University of Ljubljana, 2016-09-29)
      The META-NET research on language technologies in 2012 showed weak support for tools for crossing the language barrier for many European languages, including the South Slavic languages. Therefore, we describe a statistical ...
    • Automatic enrichment of terminological resources: the IATE RDF example 

      Arcan, Mihael; Montiel-Ponsoda, Elena; McCrae, John P.; Buitelaar, Paul (European Language Resources Association, 2018-05-07)
      Terminological resources have proven necessary in many organizations and institutions to ensure communication between experts. However, the maintenance of these resources is a very time-consuming and expensive process. ...
    • Avtomatsko pridobivanje besednih zvez iz korpusa z uporabo leksikona SSJ 

      Arhar Holdt, Špela; Arcan, Mihael (Centre for Slovene as a Second and Foreign Language, University of Ljubljana, 2011-11-17)
      The field of computational lexicography is an interdisciplinary field, primarily focusing on the automatisation of lexicographic procedures and the building of lexical databases of various kinds. In this paper we describe ...
    • A comparison of statistical and neural machine translation for Slovene, Serbian and Croatian 

      Arcan, Mihael (Language Technologies and Digital Humanities 2018, 2018-09-20)
      In this paper we present a comparison of the translation quality of Statistical Machine Translation (SMT) and Neural Machine Translation (NMT), considering translation directions between English, Slovene, Serbian and ...
    • CURED4NLG: A dataset for table-to-text generation 

      Pasricha, Nivranshu; Arcan, Mihael; Buitelaar, Paul (University of Galway, 2023)
      We introduce CURED4NLG, a dataset for the task of table-to-text generation focusing on the public health domain. The dataset consists of 280 pairs of tables and documents extracted from weekly epidemiological reports ...
    • A dataset for troll classification of Tamil memes 

      Chakravarthi, Bharathi Raja; Varma, Pranav; Arcan, Mihael; McCrae, John P.; Buitelaar, Paul; Shardul, Suryawanshi (European Language Resources Association (ELRA), 2020-05-11)
      Social media are interactive platforms that facilitate the creation or sharing of information, ideas or other forms of expression among people. This exchange is not free from offensive, trolling or malicious contents ...
    • Enhancing multiple-choice question answering with causal knowledge 

      Dalal, Dhairya; Arcan, Mihael; Buitelaar, Paul (Association for Computational Linguistics, 2021-06-10)
      The task of causal question answering aims to reason about causes and effects over a provided real or hypothetical premise. Recent approaches have converged on using transformer-based language models to solve question ...
    • Enhancing statistical machine translation with bilingual terminology in a CAT environment 

      Arcan, Mihael; Turchi, Marco; Tonelli, Sara; Buitelaar, Paul (Association for Machine Translation in the Americas, 2014-10-22)
      In this paper, we address the problem of extracting and integrating bilingual terminology into a Statistical Machine Translation (SMT) system for a Computer Aided Translation (CAT) tool scenario. We develop a framework ...
    • The ESSOT system goes wild: an easy way for translating ontologies 

      Arcan, Mihael; Dragoni, Mauro; Buitelaar, Paul (CEUR-WS.org, 2016-10-17)
      To enable knowledge access across languages, ontologies that are often represented only in English need to be translated into different languages. This activity is time-consuming; therefore, smart solutions are required ...
    • ESSOT: an expert supporting system for ontology translation 

      Arcan, Mihael; Dragoni, Mauro; Buitelaar, Paul (Springer Verlag, 2016)
      To enable knowledge access across languages, ontologies, mostly represented only in English, need to be translated into different languages. The main challenge in translating ontologies with machine translation is to ...
    • Expanding wordnets to new languages with multilingual sense disambiguation 

      Arcan, Mihael; McCrae, John P.; Buitelaar, Paul (The COLING 2016 Organizing Committee, 2016-12-11)
      Princeton WordNet is one of the most important resources for natural language processing, but is only available for English. While it has been translated using the expand approach to many other languages, this is an ...
    • Experiments with term translation 

      Arcan, Mihael; Federmann, Christian; Buitelaar, Paul (Association for Computational Linguistics, 2012-12-08)
      In this article we investigate the translation of financial terms from English into German in the isolation of an ontology vocabulary. For this study we automatically built new domain-specific resources from the translation ...
    • First insights on a passive major depressive disorder prediction system with incorporated conversational chatbot 

      Delahunty, Fionn; Wood, Ian D.; Arcan, Mihael (AICS 2018 and CEUR-WS.org, 2018-12-06)
      Almost 50% of cases of major depressive disorder go undiagnosed. In this paper, we propose a passive diagnostic system that combines the areas of clinical psychology, machine learning and conversational dialogue systems. ...
    • Generating linked-data based domain-specific sentiment lexicons from legacy language and semantic resources 

      Vulcu, Gabriela; Buitelaar, Paul; Negi, Sapna; Pereira, Bianca; Arcan, Mihael; Coughlan, Barry; Sanchez, J. Fernando; Iglesias, Carlos A. (European Language Resources Association, 2014)
      We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the ...
    • Identification of bilingual terms from monolingual documents for statistical machine translation 

      Arcan, Mihael; Giuliano, Claudio; Turchi, Marco; Buitelaar, Paul (Association for Computational Linguistics, 2014-08-23)
      The automatic translation of domain-specific documents is often a hard task for generic Statistical Machine Translation (SMT) systems, which are not able to correctly translate the large number of terms encountered in the ...
    • Identifying main obstacles for statistical machine translation of morphologically rich South Slavic languages 

      Popovic, Maja; Arcan, Mihael (European Association for Machine Translation, 2015-05-11)
      The best way to improve a statistical machine translation system is to identify concrete problems causing translation errors and address them. Many of these problems are related to the characteristics of the involved ...
    • Improving wordnets for under-resourced languages using machine translation 

      Chakravarthi, Bharathi Raja; Arcan, Mihael; McCrae, John P. (Global Wordnet Association, 2018-01-08)
      Wordnets are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involve large research groups for a long period of time, which are typically not ...
    • Inferring translation candidates for multilingual dictionary generation with multi-way neural machine translation 

      Arcan, Mihael; Torregrosa, Daniel; Ahmadi, Sina; McCrae, John P. (National University of Ireland, Galway, 2019-05-20)
      In the widely connected digital world, multilingual lexical resources are among the most important resources for natural language processing applications, including information retrieval, question answering or knowledge ...
    • Instance selection for online automatic post-editing in a multi-domain scenario 

      Chatterjee, Rajen; Arcan, Mihael; Negri, Matteo; Turchi, Marco (Association for Machine Translation in the Americas, 2016)
      In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback in the translation workflow. The performance of these systems in a multi-domain ...
    • Integrating structured and unstructured knowledge sources for domain-specific chatbots 

      Sarkar, Rajdeep (NUI Galway, 2023-11-07)
      The increasing demand for customer support in various industries and the popularity of conversational interfaces have necessitated the development of chatbots. The availability of domain knowledge has enabled chatbots to ...