Least absolute deviation estimator‐bridge variable selection and estimation for quantitative structure–activity relationship model
Published in: | Journal of chemometrics Vol. 33; no. 7 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | Chichester: Wiley Subscription Services, Inc, 01-07-2019 |
Subjects: | |
Summary: | Regression models are frequently encountered in many scientific fields, especially in quantitative structure–activity relationship (QSAR) modeling. Traditional estimation of regression model parameters assumes a normally distributed response variable and is therefore sensitive to outliers and heavy‐tailed distributions. Robust penalized regression methods have received considerable attention because they combine a robust estimation method with penalty terms to perform parameter estimation and variable selection simultaneously. In this paper, a robust variable selection and parameter estimation method based on the bridge penalty is proposed that is resistant to outliers and heavy‐tailed errors. The basic idea is to combine the least absolute deviation (LAD) estimator with the bridge penalty to produce the LAD‐bridge method. The effectiveness of the proposed method is examined through simulation studies and an application to real chemometrics data. The results confirm that LAD‐bridge can significantly reduce the prediction error compared with other existing methods.
Outliers in the biological activity variable or heavy‐tailed error distributions are often encountered in practice. Under these circumstances, a quantitative structure‐activity relationship (QSAR) model fitted by multiple linear regression is not efficient. In this paper, a robust variable selection and parameter estimation method based on the bridge penalty is proposed that is resistant to outliers and heavy‐tailed errors; it combines the least absolute deviation (LAD) estimator with the bridge penalty to produce the LAD‐bridge method. The results demonstrate the effectiveness of the proposed method in simultaneously estimating a robust QSAR model and selecting informative molecular descriptors. |
---|---|
ISSN: | 0886-9383 1099-128X |
DOI: | 10.1002/cem.3139 |
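
The summary describes LAD‐bridge as the combination of a least absolute deviation loss with a bridge penalty, but this record does not give the objective function itself. The sketch below assumes the textbook forms of both pieces: the LAD loss as the sum of absolute residuals, and the bridge penalty as `lam * sum(|beta_j| ** gamma)` with `gamma > 0`. The function name, the parameter names `lam` and `gamma`, and the toy data are illustrative assumptions, not taken from the paper, and the code only evaluates the criterion; it does not implement the authors' fitting procedure.

```python
def lad_bridge_objective(beta, X, y, lam=1.0, gamma=0.5):
    """Evaluate an assumed LAD-bridge criterion.

    objective = sum_i |y_i - x_i . beta|  +  lam * sum_j |beta_j| ** gamma
    """
    # Residuals of the linear model for each observation.
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]
    # LAD loss: absolute (not squared) residuals, so an outlier
    # contributes linearly rather than quadratically.
    lad_loss = sum(abs(r) for r in residuals)
    # Bridge penalty: shrinks coefficients; zero coefficients add nothing.
    bridge_penalty = lam * sum(abs(b) ** gamma for b in beta)
    return lad_loss + bridge_penalty


# Toy data (hypothetical): three observations, two descriptors,
# and a coefficient vector in which only the first descriptor matters.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
beta = [2.0, 0.0]
y = [sum(xij * bj for xij, bj in zip(xi, beta)) for xi in X]

clean = lad_bridge_objective(beta, X, y)

# Shift one response by 10 to mimic an outlier.
y_outlier = list(y)
y_outlier[0] += 10.0
with_outlier = lad_bridge_objective(beta, X, y_outlier)

# The objective grows by exactly the size of the shift (10),
# whereas a squared-error loss would grow by 10 ** 2 = 100.
print(clean, with_outlier)
```

Under these assumptions, the linear growth of the loss in the outlier's size is what makes the criterion less sensitive to outliers than least squares, while the penalty term is what drives variable selection.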