Recent Submissions

  • Extending largeRDFBench for multi-source data at scale for SPARQL endpoint federation 

    Hasnain, Ali; Saleem, Muhammad; Ngomo, Axel-Cyrille Ngonga; Rebholz-Schuhmann, Dietrich (IOS Press, 2018)
    Querying the Web of Data is highly motivated by the use of federation approaches mainly SPARQL query federation when the data is available through endpoints. Different benchmarks have been proposed to exploit the full ...
  • Avtomatsko pridobivanje besednih zvez iz korpusa z uporabo leksikona SSJ 

    Arhar Holdt, Špela; Arcan, Mihael (Centre for Slovene as a Second and Foreign Language, University of Ljubljana, 2011-11-17)
    The field of computational lexicography is an interdisciplinary field, primarily focusing on the automatisation of lexicographic procedures and the building of lexical databases of various kinds. In this paper we describe ...
  • Deep convolution neural network model to predict relapse in breast cancer 

    Jha, Alokkumar; Verma, Ghanshyam; Khan, Yasar; Mehmood, Qaiser; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh (IEEE, 2018-12-17)
    A mishap in anti-cancer drug distribution is critical in breast cancer patients due to poor prediction model to identify the treatment regime in ER+ve and ER-ve (Estrogen Receptor (ER)) patients. The traditional method for ...
  • Linked data based multi-omics integration and visualization for cancer decision networks

    Jha, Alokkumar; Khan, Yasar; Mehmood, Qaiser; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh (Springer Verlag, 2018-12-30)
    Visualization of Gene Expression (GE) is a challenging task since the number of genes and their associations are difficult to predict in various set of biological studies. GE could be used to understand tissue-gene-protein ...
  • Engineering an aligned gold-standard corpus of human to machine oriented Controlled Natural Language 

    Safwat, Hazem; Davis, Brian; Zarrouk, Manel (IEEE, 2018-12-03)
    Knowledge base creation and population are an essential formal backbone for a variety of intelligent applications, decision support and expert systems and intelligent search. While the abundance of unstructured text helps ...
  • SemR-11: a multi-lingual gold-standard for semantic similarity and relatedness for eleven languages 

    Barzegar, Siamak; Davis, Brian; Zarrouk, Manel; Handschuh, Siegfried; Freitas, André (European Language Resources Association, 2018-05-07)
    This work describes SemR-11, a multi-lingual dataset for evaluating semantic similarity and relatedness for 11 languages (German, French, Russian, Italian, Dutch, Chinese, Portuguese, Swedish, Spanish, Arabic and Persian). ...
  • WWW'18 open challenge: financial opinion mining and question answering 

    Maia, Macedo; Handschuh, Siegfried; Freitas, André; Davis, Brian; McDermott, Ross; Zarrouk, Manel; Balahur, Alexandra (Association for Computing Machinery, 2018-04-23)
    The growing maturity of Natural Language Processing (NLP) techniques and resources is dramatically changing the landscape of many application domains which are dependent on the analysis of unstructured data at scale. The ...
  • The SSIX corpora: three gold standard corpora for sentiment analysis in English, Spanish and German financial microblogs 

    Gaillat, Thomas; Zarrouk, Manel; Freitas, André; Davis, Brian (European Language Resources Association, 2018-05-07)
    This paper introduces the three SSIX corpora for sentiment analysis. These corpora address the need to provide annotated data for supervised learning methods. They focus on stock-market related messages extracted from two ...
  • FinSentiA: sentiment analysis in English financial microblogs 

    Gaillat, Thomas; Sousa, Annanda; Zarrouk, Manel; Davis, Brian (NUI Galway, 2018-05-14)
    The objective of this paper is to report on the building of a Sentiment Analysis (SA) system dedicated to financial microblogs in English. The purpose of our work is to build a financial classifier that predicts the sentiment ...
  • From simplified text to knowledge representation using controlled natural language 

    Safwat, Hazem; Zarrouk, Manel; Davis, Brian (2017-04-17)
    Knowledge based systems provide means to store data and perform reasoning on top of it. Controlled Natural Language (CNL) is considered as an engineered subset of natural language. CNLs aim to abstract the complexity ...
  • Implicit and explicit aspect extraction in financial microblogs 

    Gaillat, Thomas; Stearns, Bernardo; McDermott, Ross; Sridhar, Gopal; Zarrouk, Manel; Davis, Brian (Association for Computational Linguistics, 2018-07-15)
    This paper focuses on aspect extraction which is a sub-task of Aspect-based Sentiment Analysis. The goal is to report an extraction method of financial aspects in microblog messages. Our approach uses a stock-investment ...
  • Rewriting simplified text into a controlled natural language 

    Safwat, Hazem; Zarrouk, Manel; Davis, Brian (IOS Press, 2018-08-27)
    While machine processable Controlled Natural Languages (CNLs) as a natural language interface have proven a popular, unambiguous and user friendly method for non experts to engineer formal knowledge-bases, human-oriented ...
  • Experiments with term translation 

    Arcan, Mihael; Federmann, Christian; Buitelaar, Paul (Association for Computational Linguistics, 2012-12-08)
    In this article we investigate the translation of financial terms from English into German in the isolation of an ontology vocabulary. For this study we automatically built new domain-specific resources from the translation ...
  • Using domain-specific and collaborative resources for term translation 

    Arcan, Mihael; Buitelaar, Paul; Federmann, Christian (Association for Computational Linguistics, 2012-07)
    In this article we investigate the translation of terms from English into German and vice versa in the isolation of an ontology vocabulary. For this study we built new domain-specific resources from the translation ...
  • Translating the FINREP taxonomy using a domain-specific corpus 

    Arcan, Mihael; Thomas, Susan Marie; De Brandt, Derek; Buitelaar, Paul (IAMT and EAMT, 2013-09-02)
    Our research investigates the use of statistical machine translation (SMT) to translate the labels of concepts in an XBRL taxonomy. Often taxonomy concepts are given labels in only one language. To enable knowledge access ...
  • Ontology label translation 

    Arcan, Mihael; Buitelaar, Paul (Association for Computational Linguistics, 2013-06-09)
    Our research investigates the translation of ontology labels, which has applications in multilingual knowledge access. Ontologies are often defined only in one language, mostly English. To enable knowledge access ...
  • Linguistic linked data for sentiment analysis 

    Buitelaar, Paul; Arcan, Mihael; Iglesias, Carlos A.; Sánchez-Rada, Juan Fernando; Strapparava, Carlo (Association for Computational Linguistics, 2013-09-23)
    In this paper we describe the specification of a model for the semantically interoperable representation of language resources for sentiment analysis. The model integrates ‘lemon’, an RDF-based model for the specification ...
  • Enhancing statistical machine translation with bilingual terminology in a CAT environment 

    Arcan, Mihael; Turchi, Marco; Tonelli, Sara; Buitelaar, Paul (Association for Machine Translation in the Americas, 2014-10-22)
    In this paper, we address the problem of extracting and integrating bilingual terminology into a Statistical Machine Translation (SMT) system for a Computer Aided Translation (CAT) tool scenario. We develop a framework ...
  • One size does not fit all: querying web polystores 

    Khan, Yasar; Zimmermann, Antoine; Jha, Alokkumar; Gadepally, Vijay; d'Aquin, Mathieu; Sahay, Ratnesh (IEEE, 2019-01-17)
    Data retrieval systems are facing a paradigm shift due to the proliferation of specialized data storage engines (SQL, NoSQL, Column Stores, MapReduce, Data Stream, and Graph) supported by varied data models (CSV, JSON, ...
  • Generating linked-data based domain-specific sentiment lexicons from legacy language and semantic resources 

    Vulcu, Gabriela; Buitelaar, Paul; Negi, Sapna; Pereira, Bianca; Arcan, Mihael; Coughlan, Barry; Sanchez, J. Fernando; Iglesias, Carlos A. (European Language Resources Association, 2014)
    We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the ...
