Automatic module selection from several microarray gene expression studies

Bibliographic Details
Published in: Biostatistics (Oxford, England), Vol. 19, no. 2, pp. 153-168
Main Authors: Zollinger, Alix; Davison, Anthony C.; Goldstein, Darlene R.
Format: Journal Article
Language: English
Published: England, Oxford University Press, 01-04-2018
Description
Summary: Independence of genes is commonly but incorrectly assumed in microarray data analysis; rather, genes are activated in co-regulated sets referred to as modules. In this article, we develop an automatic method to define modules common to multiple independent studies. We use an empirical Bayes procedure to estimate a sparse correlation matrix for all studies, identify modules by clustering, and develop an extreme-value-based method to detect so-called scattered genes, which do not belong to any module. The resulting algorithm is very fast and produces accurate modules in simulation studies. Application to real data identifies modules with significant enrichment and results in a huge dimension reduction, which can alleviate the computational burden of further analyses.
ISSN: 1465-4644, 1468-4357
DOI: 10.1093/biostatistics/kxx032
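
The three stages named in the summary (a sparse correlation matrix shared across studies, clustering into modules, and detection of scattered genes) can be illustrated with a toy sketch. This is not the authors' algorithm: a hard threshold stands in for the empirical Bayes sparse estimate, connected components stand in for the clustering step, and a "singleton" rule stands in for the extreme-value-based detection. The function names, the `threshold` parameter, and the data are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def mean_correlation(studies):
    """Average the gene-gene correlation over studies.

    Each study is a list of gene rows; all studies list the same genes
    in the same order.
    """
    g = len(studies[0])
    corr = [[0.0] * g for _ in range(g)]
    for study in studies:
        for i in range(g):
            for j in range(i + 1, g):
                r = pearson(study[i], study[j]) / len(studies)
                corr[i][j] += r
                corr[j][i] += r
    return corr


def find_modules(corr, threshold=0.5):
    """Assumed simplification: keep only correlations >= threshold, take
    connected components as modules, and call singleton genes scattered."""
    g = len(corr)
    seen = [False] * g
    modules, scattered = [], []
    for start in range(g):
        if seen[start]:
            continue
        seen[start] = True
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(g):
                if not seen[j] and corr[i][j] >= threshold:
                    seen[j] = True
                    stack.append(j)
        if len(members) > 1:
            modules.append(sorted(members))
        else:
            scattered.append(start)
    return modules, scattered


# Made-up data: genes 0-1 share one signal, genes 2-3 another, gene 4 neither.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, -1, 1, -1, 1, -1, 1, -1]
c = [1, 1, -1, -1, -1, -1, 1, 1]


def make_study(sig_a, sig_b, sig_c):
    return [sig_a, [2 * v + 1 for v in sig_a], sig_b, [3 * v for v in sig_b], sig_c]


studies = [make_study(a, b, c), make_study(a[::-1], b, c)]
modules, scattered = find_modules(mean_correlation(studies))
print(modules, scattered)  # [[0, 1], [2, 3]] [4]
```

In this toy input the two studies disagree on the sign of the weak correlation between the two signals, so averaging cancels it, which hints at why pooling several independent studies can stabilize module definitions.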