Automatic module selection from several microarray gene expression studies
SUMMARY Independence of genes is commonly but incorrectly assumed in microarray data analysis; rather, genes are activated in co-regulated sets referred to as modules. In this article, we develop an automatic method to define modules common to multiple independent studies. We use an empirical Bayes...
Saved in:
Published in: | Biostatistics (Oxford, England) Vol. 19; no. 2; pp. 153 - 168 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: |
England
Oxford University Press
01-04-2018
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | SUMMARY
Independence of genes is commonly but incorrectly assumed in microarray data analysis; rather, genes are activated in co-regulated sets referred to as modules. In this article, we develop an automatic method to define modules common to multiple independent studies. We use an empirical Bayes procedure to estimate a sparse correlation matrix for all studies, identify modules by clustering, and develop an extreme-value-based method to detect so-called scattered genes, which do not belong to any module. The resulting algorithm is very fast and produces accurate modules in simulation studies. Application to real data identifies modules with significant enrichment and results in a huge dimension reduction, which can alleviate the computational burden of further analyses. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1465-4644 1468-4357 |
DOI: | 10.1093/biostatistics/kxx032 |