
dc.contributor.author: Khan, Yasar
dc.contributor.author: Zimmermann, Antoine
dc.contributor.author: Jha, Alokkumar
dc.contributor.author: Gadepally, Vijay
dc.contributor.author: d'Aquin, Mathieu
dc.contributor.author: Sahay, Ratnesh
dc.date.accessioned: 2019-02-07T11:21:00Z
dc.date.available: 2019-02-07T11:21:00Z
dc.date.issued: 2019-01-17
dc.identifier.citation: Khan, Yasar; Zimmermann, Antoine; Jha, Alokkumar; Gadepally, Vijay; d'Aquin, Mathieu; Sahay, Ratnesh (2019). One Size Does Not Fit All: Querying Web Polystores. IEEE Access, 7, 9598-9617. doi: 10.1109/ACCESS.2018.2888601 [en_IE]
dc.identifier.issn: 2169-3536
dc.identifier.uri: http://hdl.handle.net/10379/14919
dc.description.abstract: Data retrieval systems are facing a paradigm shift due to the proliferation of specialized data storage engines (SQL, NoSQL, Column Stores, MapReduce, Data Stream, and Graph) supported by varied data models (CSV, JSON, RDB, RDF, and XML). One immediate consequence of this paradigm shift is a data bottleneck over the web; that is, web applications are unable to retrieve data at the rate at which data are being generated by different facilities. Especially in the genomics and healthcare verticals, data are growing from petascale to exascale, and biomedical stakeholders are expecting seamless retrieval of these data over the web. In this paper, we argue that the bottleneck over the web can be reduced by minimizing the costly data conversion process and delegating query performance and processing loads to the specialized data storage engines over their native data models. We propose a web-based query federation mechanism, called PolyWeb, that unifies query answering over multiple native data models (CSV, RDB, and RDF). We emphasize two main challenges of query federation over native data models: 1) devising a method to select prospective data sources, with different underlying data models, that can satisfy a given query and 2) query optimization, join, and execution over different data models. We demonstrate PolyWeb on a cancer genomics use case, where a description of biological and chemical entities (e.g., gene, disease, drug, and pathways) often spans multiple data models and their respective storage engines. In order to assess the benefits and limitations of evaluating queries over native data models, we evaluate PolyWeb against state-of-the-art query federation engines in terms of result completeness, source selection, and overall query execution time. [en_IE]
dc.format: application/pdf [en_IE]
dc.language.iso: en [en_IE]
dc.publisher: IEEE [en_IE]
dc.relation.ispartof: IEEE Access [en]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Databases [en_IE]
dc.subject: World wide web [en_IE]
dc.subject: Query federation [en_IE]
dc.subject: Query optimization [en_IE]
dc.subject: Query planning [en_IE]
dc.subject: Linked data [en_IE]
dc.subject: SPARQL [en_IE]
dc.subject: Healthcare [en_IE]
dc.subject: Life sciences [en_IE]
dc.title: One size does not fit all: querying web polystores [en_IE]
dc.type: Article [en_IE]
dc.date.updated: 2019-02-01T14:52:48Z
dc.identifier.doi: 10.1109/ACCESS.2018.2888601
dc.local.publishedsource: https://dx.doi.org/10.1109/ACCESS.2018.2888601 [en_IE]
dc.description.peer-reviewed: peer-reviewed
dc.internal.rssid: 15768505
dc.local.contact: Yasar Khan, DERI, NUI Galway. Email: yasar.khan@nuigalway.ie
dc.local.copyrightchecked: Yes
dc.local.version: PUBLISHED
nui.item.downloads: 478



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland