Least absolute deviation estimator‐bridge variable selection and estimation for quantitative structure–activity relationship model

Bibliographic Details
Published in: Journal of Chemometrics, Vol. 33, no. 7
Main Authors: Al‐Dabbagh, Zainab Tawfeeq; Algamal, Zakariya Yahya
Format: Journal Article
Language: English
Published: Chichester: Wiley Subscription Services, Inc., 01-07-2019
Description
Summary: Regression models are frequently encountered in many scientific fields, especially in quantitative structure–activity relationship (QSAR) modeling. Traditional estimation of regression model parameters assumes that the response variable is normally distributed and is therefore sensitive to outliers and heavy‐tailed distributions. Robust penalized regression methods have received considerable attention because they combine a robust estimation method with penalty terms to perform parameter estimation and variable selection simultaneously. In this paper, a robust variable selection and parameter estimation method based on the bridge penalty is proposed that is resistant to outliers and heavy‐tailed errors. The basic idea is to combine the least absolute deviation (LAD) estimator with the bridge penalty to produce the LAD‐bridge method. The effectiveness of the proposed method is examined through simulation studies and an application to real chemometrics data. The results confirm that LAD‐bridge can significantly reduce the prediction error compared with other existing methods.

Outliers in the biological activity variable or heavy‐tailed error distributions are often encountered in practice. Under these circumstances, a QSAR model based on multiple linear regression is not efficient. The proposed LAD‐bridge method, which combines the LAD estimator with the bridge penalty, is resistant to outliers and heavy‐tailed errors. The results demonstrate the effectiveness of the proposed method in simultaneously estimating a robust QSAR model and selecting informative molecular descriptors.
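The summary above describes LAD‐bridge as a combination of the LAD loss and the bridge penalty but does not give the objective function. The following is a minimal sketch, assuming the commonly used form of the bridge penalty, lam * sum(|beta_j|**gamma), added to the sum of absolute residuals. The function name and the parameters `lam` and `gamma` are hypothetical and are not taken from the paper, which may define or scale the objective differently.

```python
def lad_bridge_objective(X, y, beta, lam, gamma):
    """Evaluate an assumed LAD-bridge objective (illustrative, not the paper's code).

    X     : list of rows, each a list of descriptor values
    y     : list of responses (e.g. biological activities)
    beta  : list of regression coefficients
    lam   : non-negative penalty weight (hypothetical name)
    gamma : positive bridge exponent (hypothetical name)
    """
    # Residuals y_i - x_i . beta for every observation
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]
    # LAD loss: sum of absolute residuals, so a single outlier
    # contributes linearly rather than quadratically
    lad_loss = sum(abs(r) for r in residuals)
    # Bridge penalty: lam * sum |beta_j|^gamma
    penalty = lam * sum(abs(b) ** gamma for b in beta)
    return lad_loss + penalty


# Toy data: y is reproduced exactly by beta = [1, 2]
X = [[1, 0], [0, 1], [1, 1]]
y = [1, 2, 3]
fit_value = lad_bridge_objective(X, y, [1, 2], lam=1.0, gamma=0.5)
null_value = lad_bridge_objective(X, y, [0, 0], lam=1.0, gamma=0.5)
```

With an exact fit the value consists of the penalty alone, while the all-zero coefficient vector pays only the LAD loss. An estimator of this kind would minimize the objective over `beta`; the minimization itself is not sketched here.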
ISSN: 0886-9383, 1099-128X
DOI: 10.1002/cem.3139