An improved fuzzy based approach to impute missing values in DNA microarray gene expression data with collaborative filtering

Bibliographic Details
Published in: 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 911-916
Main Authors: Saha, Sujay, Bandopadhyay, Saikat, Ghosh, Anupam, Dey, Kashi Nath
Format: Conference Proceeding
Language: English
Published: IEEE 01-09-2016
Description
Summary: DNA microarray experiments typically generate gene expression profiles in the form of high-dimensional matrices. These matrices often contain many missing values, for reasons such as image disruption, hybridization error, dust, and moderate resolution. Such missing values can significantly degrade the performance of subsequent statistical and machine learning experiments. Various missing value estimation algorithms exist. In this work we propose a modification of an existing imputation approach, Collaborative Filtering Based on Rough-Set Theory (CFBRST) [10]. The proposed approach (CFBRSTFDV) combines a Fuzzy Difference Vector (FDV) with rough-set-based collaborative filtering, which analyzes historical interactions to help estimate the missing values. It is a suggestion-based system that works on the same principle by which items or products are suggested to an individual using Facebook or Twitter, or looking for books on Amazon. We applied the proposed algorithm to two benchmark datasets, SPELLMAN and Tumor Cell (GDS2932), and the experiments show that the modified approach, CFBRSTFDV, outperforms the other existing state-of-the-art methods in terms of RMSE, particularly as the number of missing values increases.
DOI: 10.1109/ICACCI.2016.7732161
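
The summary reports results as RMSE at increasing numbers of missing values. As a rough illustration of how such an evaluation is commonly set up, the sketch below hides a fraction of known entries in a small matrix, fills them in, and computes RMSE over the hidden positions only. It does not implement CFBRSTFDV: the imputer is a plain row-mean placeholder, and the function names (`mask_entries`, `row_mean_impute`, `rmse`), the toy matrix, and the masking fractions are all assumptions made for this example, not taken from the paper.

```python
import math
import random


def mask_entries(matrix, fraction, seed=0):
    """Return a copy with `fraction` of entries set to None, plus the hidden positions."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    hidden = rng.sample(cells, int(round(fraction * len(cells))))
    masked = [list(row) for row in matrix]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def row_mean_impute(masked):
    """Placeholder imputer (NOT CFBRSTFDV): fill each gap with the mean of
    the observed values in the same row (gene); 0.0 if the row is empty."""
    filled = []
    for row in masked:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def rmse(truth, imputed, hidden):
    """Root mean squared error over the artificially hidden positions only."""
    if not hidden:
        return 0.0
    squared = sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(squared / len(hidden))


# Toy expression matrix (rows = genes, columns = samples); values are made up.
expr = [
    [0.1, 0.4, 0.3, 0.2],
    [1.0, 1.2, 0.9, 1.1],
    [-0.5, -0.4, -0.6, -0.5],
]

for fraction in (0.1, 0.25, 0.5):
    masked, hidden = mask_entries(expr, fraction)
    error = rmse(expr, row_mean_impute(masked), hidden)
    print(f"missing fraction {fraction}: RMSE {error:.4f}")
```

Under this setup, any other imputation method could be swapped in for `row_mean_impute` and compared on the same masked positions.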