I/NI-calls for the exclusion of non-informative genes: a highly effective filtering tool for microarray data


Bibliographic Details
Published in: Bioinformatics, Vol. 23, no. 21, pp. 2897-2902
Main Authors: Talloen, Willem, Clevert, Djork-Arné, Hochreiter, Sepp, Amaratunga, Dhammika, Bijnens, Luc, Kass, Stefan, Göhlmann, Hinrich W.H.
Format: Journal Article
Language: English
Published: Oxford: Oxford University Press, 01-11-2007
Oxford Publishing Limited (England)
Description
Summary:
Motivation: DNA microarray technology typically generates many measurements, of which only a relatively small subset is informative for the interpretation of the experiment. To avoid false positive results, it is therefore critical to select the informative genes from the large, noisy data set before the actual analysis. Most currently available filtering techniques are supervised and therefore carry a risk of overfitting. The unsupervised filtering techniques, on the other hand, are either not very efficient or too stringent, as they may mix up signal with noise. We propose to use the multiple probes measuring the same target mRNA as repeated measures to quantify the signal-to-noise ratio of that specific probe set. A Bayesian factor analysis with specifically chosen prior settings, which models this probe-level information, provides an objective feature filtering technique named informative/non-informative calls (I/NI calls).
Results: Based on 30 real-life data sets (including various human, rat, mouse and Arabidopsis studies) and a spiked-in data set, it is shown that I/NI calls is highly effective, with exclusion rates ranging from 70% to 99%. Consequently, it offers a critical solution to the curse of high dimensionality in the analysis of microarray data.
Availability: This filtering approach is publicly available as a function implemented in the R package FARMS (www.bioinf.jku.at/software/farms/farms.html).
Contact: wtalloen@prdbe.jnj.com
Supplementary information: Supplementary data are available at Bioinformatics online.
Bibliography: The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
istex:187F4806AFF5EA843F113BAC3B6EF60BF1D70B3C
ark:/67375/HXZ-0RB7XF8W-Q
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btm478