dc.contributor.author: Geeleher, Paul
dc.contributor.author: Hartnett, Lori
dc.contributor.author: Egan, Laurance J.
dc.contributor.author: Golden, Aaron
dc.contributor.author: Raja Ali, Raja Affendi
dc.contributor.author: Seoighe, Cathal
dc.date.accessioned: 2018-09-20T16:08:54Z
dc.date.available: 2018-09-20T16:08:54Z
dc.date.issued: 2013-06-03
dc.identifier.citation: Geeleher, Paul; Hartnett, Lori; Egan, Laurance J.; Golden, Aaron; Raja Ali, Raja Affendi; Seoighe, Cathal (2013). Gene-set analysis is severely biased when applied to genome-wide methylation data. Bioinformatics 29 (15), 1851-1857
dc.identifier.issn: 1460-2059,1367-4803
dc.identifier.uri: http://hdl.handle.net/10379/11600
dc.description.abstract: Motivation: DNA methylation is an epigenetic mark that can stably repress gene expression. Because of its biological and clinical significance, several methods have been developed to compare genome-wide patterns of methylation between groups of samples. The application of gene set analysis to identify relevant groups of genes that are enriched for differentially methylated genes is often a major component of the analysis of these data. This can be used, for example, to identify processes or pathways that are perturbed in disease development. We show that gene-set analysis, as it is typically applied to genome-wide methylation assays, is severely biased as a result of differences in the numbers of CpG sites associated with different classes of genes and gene promoters. Results: We demonstrate this bias using published data from a study of differential CpG island methylation in lung cancer and a dataset we generated to study methylation changes in patients with long-standing ulcerative colitis. We show that several of the gene sets that seem enriched would also be identified with randomized data. We suggest two existing approaches that can be adapted to correct the bias. Accounting for the bias in the lung cancer and ulcerative colitis datasets provides novel biological insights into the role of methylation in cancer development and chronic inflammation, respectively. Our results have significant implications for many previous genome-wide methylation studies that have drawn conclusions on the basis of such strongly biased analysis.
dc.publisher: Oxford University Press (OUP)
dc.relation.ispartof: Bioinformatics
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: cpg island shores
dc.subject: DNA methylation
dc.subject: lung-cancer
dc.subject: expression
dc.subject: hypermethylation
dc.subject: tissue
dc.subject: resolution
dc.subject: lists
dc.subject: patterns
dc.subject: ontology
dc.title: Gene-set analysis is severely biased when applied to genome-wide methylation data
dc.type: Article
dc.identifier.doi: 10.1093/bioinformatics/btt311
dc.local.publishedsource: https://academic.oup.com/bioinformatics/article-pdf/29/15/1851/16913699/btt311.pdf
nui.item.downloads: 0



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland