Facilitating prediction of adverse drug reactions by using knowledge graphs and multi-label learning models

Muñoz, Emir; Nováček, Vít; Vandenbussche, Pierre-Yves

Date

2017-08-18

Recommended Citation

Emir Muñoz, Vít Nováček, Pierre-Yves Vandenbussche; Facilitating prediction of adverse drug reactions by using knowledge graphs and multi-label learning models, Briefings in Bioinformatics, bbx099, https://doi.org/10.1093/bib/bbx099

Published Version

https://doi.org/10.1093/bib/bbx099

Abstract

Timely identification of adverse drug reactions (ADRs) is highly important in public health and pharmacology. Early discovery of potential ADRs can limit their effect on patients' lives and make drug development pipelines more robust and efficient. Reliable in silico prediction of ADRs can help in this context and has therefore been studied intensely. Recent works have achieved promising results using machine learning. The presented work focuses on machine learning methods that use drug profiles for making predictions and draw features from multiple data sources. We argue that, despite promising results, existing works have limitations, especially regarding flexibility in experimenting with different data sets and/or predictive models. We propose to address these limitations by generalizing the key principles used by the state of the art. Namely, we explore the effects of: (1) using knowledge graphs (machine-readable interlinked representations of biomedical knowledge) as a convenient uniform representation of heterogeneous data; and (2) casting ADR prediction as a multi-label ranking problem. We present a specific way of using knowledge graphs to generate different feature sets and demonstrate favourable performance of selected off-the-shelf multi-label learning models in comparison with existing works. Our experiments suggest that certain multi-label learning methods are better suited to applications where ranking is preferred. The presented approach can be easily extended to other feature sources or machine learning methods, making it flexible for experiments tuned toward specific requirements of end users. Our work also provides a clearly defined and reproducible baseline for any future related experiments.
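
To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' pipeline or any of the models evaluated in the paper. It assumes a toy knowledge graph stored as (subject, predicate, object) triples, derives a set-valued feature profile per drug from its non-ADR edges, and ranks candidate ADRs for a query drug by similarity-weighted voting over drugs with known ADRs. All identifiers (drug_a, target_1, adr_x, has_side_effect and so on) are hypothetical placeholders, and Jaccard-weighted voting is a stand-in chosen only for brevity.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# Every identifier here is a hypothetical placeholder, not real biomedical data.
TRIPLES = [
    ("drug_a", "has_target", "target_1"),
    ("drug_a", "has_pathway", "pathway_1"),
    ("drug_a", "has_side_effect", "adr_x"),
    ("drug_a", "has_side_effect", "adr_y"),
    ("drug_b", "has_target", "target_1"),
    ("drug_b", "has_pathway", "pathway_2"),
    ("drug_b", "has_side_effect", "adr_x"),
    ("drug_c", "has_target", "target_2"),
    ("drug_c", "has_pathway", "pathway_2"),
    ("drug_c", "has_side_effect", "adr_z"),
    ("drug_q", "has_target", "target_1"),   # query drug: no known ADRs
    ("drug_q", "has_pathway", "pathway_1"),
]

# Assumed name of the predicate that links a drug to its known ADRs.
LABEL_PREDICATE = "has_side_effect"


def build_profiles(triples):
    """Split the graph into per-drug feature sets and per-drug ADR label sets."""
    features = defaultdict(set)
    labels = defaultdict(set)
    for subj, pred, obj in triples:
        if pred == LABEL_PREDICATE:
            labels[subj].add(obj)
        else:
            # Each (predicate, object) pair acts as one binary feature.
            features[subj].add((pred, obj))
    return features, labels


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_adrs(query, features, labels):
    """Score each ADR by the summed similarity of the labelled drugs that have it."""
    scores = defaultdict(float)
    for drug, adrs in labels.items():
        if drug == query:
            continue
        sim = jaccard(features[query], features[drug])
        for adr in adrs:
            scores[adr] += sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


features, labels = build_profiles(TRIPLES)
ranking = rank_adrs("drug_q", features, labels)
for adr, score in ranking:
    print(f"{adr}\t{score:.3f}")
```

The output is an ordered list of every candidate ADR with a score, rather than a yes/no decision per ADR, which is what distinguishes a ranking formulation from plain multi-label classification. Adding a new data source in this sketch only means adding triples with a new predicate, which mirrors the flexibility argument made in the abstract.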

URI

http://hdl.handle.net/10379/6749

Collections

Data Science Institute (Scholarly Articles)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland