Large-Scale Expectile Regression With Covariates Missing at Random

Analysis of large volumes of data is very complex due to not only a high level of skewness and heteroscedasticity of variance but also the phenomenon of missing data. Expectile regression is a popular alternative method of analyzing heterogeneous data. In this paper, we consider fitting a linear exp...

Full description

Saved in:
Bibliographic Details
Published in:IEEE access Vol. 8; pp. 36502 - 36513
Main Authors: Pan, Yingli, Liu, Zhan, Cai, Wen
Format: Journal Article
Language:English
Published: Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Analysis of large volumes of data is very complex due to not only a high level of skewness and heteroscedasticity of variance but also the phenomenon of missing data. Expectile regression is a popular alternative method of analyzing heterogeneous data. In this paper, we consider fitting a linear expectile regression model for estimating conditional expectiles based on a large quantity of data with covariates missing at random. We construct a communication-efficient surrogate loss (CSL) function to estimate model parameters. The asymptotic normality of the proposed estimator is established. A proximal alternating direction method of multipliers (ADMM) algorithm is developed for distributed statistical optimization on a large quantity of data. Simulation studies are performed to assess the finite-sample performance of the proposed method. Survey data from the Behavioral Risk Factor Surveillance System (BRFSS) is used to demonstrate the utility of the proposed method in practice.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2020.2970741