Imputation of missing values for compositional data using classical and robust methods

Bibliographic Details
Published in:	Computational Statistics & Data Analysis, Vol. 54, no. 12, pp. 3095-3107
Main Authors:	Hron, K., Templ, M., Filzmoser, P.
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 01-12-2010
Series:	Computational Statistics & Data Analysis
Subjects:	Accounting; Data processing; Iterative methods; Mathematical models; Proposals; Statistics

Description
Summary:	New imputation algorithms for estimating missing values in compositional data are introduced. The first proposal uses a k-nearest neighbor procedure based on the Aitchison distance, a distance measure designed specifically for compositional data. The estimated missing values must be adjusted to the overall size of the compositional parts of the neighbors. The second proposal is an iterative model-based imputation technique that is initialized with the result of the proposed k-nearest neighbor procedure. It is based on iterative regressions and thereby accounts for the whole multivariate data information. The regressions have to be performed in a transformed space, and depending on the data quality, classical or robust regression techniques can be employed. The proposed methods are tested on a real data set and on simulated data sets. The results show that the proposed methods outperform standard imputation methods. In the presence of outliers, the model-based method with robust regressions is preferable.
ISSN:	0167-9473, 1872-7352
DOI:	10.1016/j.csda.2009.11.023
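
The k-nearest neighbor proposal in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `aitchison_distance` and `knn_impute` are hypothetical, neighbors are restricted to fully observed rows for simplicity, and the size adjustment (ratio of the sums of the observed parts) is an assumption about how the estimates might be rescaled to the neighbors' overall size. The iterative regression step of the second proposal is not covered.

```python
import numpy as np


def aitchison_distance(x, y):
    """Aitchison distance between two compositions with strictly positive parts."""
    lx, ly = np.log(x), np.log(y)
    # Centering the logs gives the clr transform; the Aitchison distance is
    # the Euclidean distance between the two clr-transformed vectors.
    return np.linalg.norm((lx - lx.mean()) - (ly - ly.mean()))


def knn_impute(X, k=3):
    """Impute NaN entries from the k nearest fully observed rows (illustrative).

    Assumes every incomplete row has at least two observed parts, since a
    single observed part carries no relative information.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]  # candidate neighbors (simplification)
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        # Distances are computed on the observed subcomposition only.
        d = np.array([aitchison_distance(row[obs], c[obs]) for c in complete])
        nn = complete[np.argsort(d)[:k]]
        # Assumed size adjustment: rescale each neighbor so that its observed
        # parts sum to the same total as those of the incomplete row.
        scale = row[obs].sum() / nn[:, obs].sum(axis=1)
        # The median of the rescaled neighbor values is the imputed value.
        out[i, miss] = np.median(nn[:, miss] * scale[:, None], axis=0)
    return out
```

Because the Aitchison distance depends only on ratios between parts, rows that are scalar multiples of one another have distance zero. The rescaling step is what puts the imputed value back on the scale of the incomplete row.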