Linked functional annotation for differentially expressed gene (DEG) demonstrated using Illumina Body Map 2.0
Date
2015-12-09
Author
Jha, Alokkumar
Khan, Yasar
Iqbal, Aftab
Zappa, Achille
Mehdi, Muntazir
Sahay, Ratnesh
Rebholz-Schuhmann, Dietrich
Recommended Citation
Jha, Alokkumar, Khan, Yasar, Mehdi, Muntazir, Iqbal, Aftab, Zappa, Achille, Sahay, Ratnesh, & Rebholz-Schuhmann, Dietrich. (2015). Linked Functional Annotation For Differentially Expressed Gene (DEG) Demonstrated using Illumina Body Map 2.0. Paper presented at the International Conference of Semantic Web Applications and Tools for the Life Sciences (SWAT4LS), Cambridge, United Kingdom, 09 December.
Abstract
Semantic Web technologies are core to the integration of
disparate data resources. They can be used to exploit data from next-generation
sequencing (NGS) for therapeutic decisions regarding cancer. In
this manuscript, we describe how different data resources, which report
on the expression of specific genes in a tissue and on their variants, can be
brought together to indicate a risk of tissue-specific cancer from NGS
data. This approach can be used to judge patient genomic data against
public reference data resources.
The TCGA and COSMIC repositories are processed to connect and
query information concerning the expression of genes, copy number variants
(CNV), and somatic mutations. We annotated sets of differential expression
data provided by the Illumina Body Map 2.0 (HBM) concerning
16 different tissue types and identified genes with an RPKM (Reads Per
Kilobase of transcript per Million mapped reads) value greater than 0.5
as a measure indicating an associated risk for cancer. Thus, the differentially
expressed genes from HBM can be associated with a tissue type and with gene
expression in COSMIC and TCGA, leading to a potential biomarker for
that particular tissue-specific cancer. In the case of ovarian cancer, we
retrieved the genomic positions (loci) and the associated genes of potential
biomarker candidates, and suggest that this approach and platform
can serve future studies well.
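As an illustration of the RPKM threshold mentioned above, the following minimal sketch computes RPKM with the standard definition (read count × 10^9 divided by the product of transcript length in base pairs and total mapped reads) and keeps genes whose value exceeds 0.5. The function names, gene names, and numbers are hypothetical and not taken from HBM; this is not the authors' pipeline.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def genes_above_threshold(counts, lengths_bp, total_mapped_reads, threshold=0.5):
    """Return {gene: rpkm} for genes whose RPKM exceeds the threshold."""
    selected = {}
    for gene, count in counts.items():
        value = rpkm(count, lengths_bp[gene], total_mapped_reads)
        if value > threshold:
            selected[gene] = value
    return selected


# Hypothetical toy values for one tissue sample (assumption, not HBM data).
counts = {"GENE_A": 500, "GENE_B": 2}
lengths_bp = {"GENE_A": 2000, "GENE_B": 4000}
total_mapped_reads = 10_000_000

selected = genes_above_threshold(counts, lengths_bp, total_mapped_reads)
print(selected)
```

With these toy numbers only GENE_A passes the 0.5 cutoff; GENE_B falls below it.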
Altogether, the presented linked annotation platform is the first approach
to represent the COSMIC data in an RDF format and to link the data
with the TCGA datasets. The proposed approach enriches mutations by
filling in missing links between the COSMIC and TCGA datasets, which in turn
helps to map mutations to associated phenotypes.
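To make the idea of linking COSMIC and TCGA records in RDF more concrete, here is a small sketch that serializes mutation records as N-Triples and links them to records that share a gene symbol. The `http://example.org/` namespace, the predicate names, the record fields, and the choice of gene symbol as the join key are all assumptions made for illustration; the abstract does not specify the platform's actual vocabulary or linking strategy.

```python
# Hypothetical namespace; the platform's real vocabulary is not given in the abstract.
EX = "http://example.org/"


def triple(subject, predicate, obj, literal=False):
    """Format one N-Triples statement; obj is an IRI unless literal=True."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = '"%s"' % escaped
    else:
        rendered = "<%s>" % obj
    return "<%s> <%s> %s ." % (subject, predicate, rendered)


def link_records(cosmic_rows, tcga_rows):
    """Emit triples for COSMIC-style mutations and link each one to
    TCGA-style records that share its gene symbol (assumed join key)."""
    tcga_by_gene = {}
    for row in tcga_rows:
        tcga_by_gene.setdefault(row["gene"], []).append(row)

    lines = []
    for row in cosmic_rows:
        subject = EX + "cosmic/mutation/" + row["id"]
        lines.append(triple(subject, EX + "vocab/gene", row["gene"], literal=True))
        for match in tcga_by_gene.get(row["gene"], []):
            target = EX + "tcga/record/" + match["id"]
            lines.append(triple(subject, EX + "vocab/linkedTo", target))
    return lines


# Hypothetical toy records (assumption, not real COSMIC or TCGA entries).
cosmic_rows = [{"id": "M1", "gene": "GENE_A"}]
tcga_rows = [{"id": "T1", "gene": "GENE_A"}, {"id": "T2", "gene": "GENE_B"}]

triples = link_records(cosmic_rows, tcga_rows)
print("\n".join(triples))
```

The resulting statements could be loaded into any triple store and queried with SPARQL; a production setup would more likely use an RDF library and established vocabularies rather than hand-built strings.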