VSN: Variable sorting for normalization

Bibliographic Details
Published in: Journal of Chemometrics, Vol. 34, no. 2
Main Authors: Rabatel, Gilles; Marini, Federico; Walczak, Beata; Roger, Jean‐Michel
Format: Journal Article
Language: English
Published: Chichester: Wiley Subscription Services, Inc, 01-02-2020
Wiley
Description
Summary: Spectrometric and analytical techniques in general collect multivariate signals from chemical or biological materials by means of specific measurement instrumentation, usually in order to characterize or classify them through the estimation of one or several compounds of interest. However, measurement conditions might induce various additive (baseline) or multiplicative effects on the collected signals, which may jeopardize the accuracy and generalizability of estimation models. A common way of dealing with such issues is signal normalization and in particular, when the baseline is constant, the standard normal variate (SNV) transform. Despite its efficiency, SNV has important drawbacks in terms of physical interpretation and robustness of estimation models, because all the variables are considered equally, independently of their actual relationship with the response(s) of interest. In the present study, a novel algorithm is proposed, named variable sorting for normalization (VSN). This algorithm automatically produces, for a given set of multivariate signals, a weighting function favoring signal variables that are only impacted by additive and multiplicative effects, and not by the response(s) of interest. When introduced in SNV preprocessing, this weighting function significantly improves signal shape and model interpretation. Moreover, VSN can be successfully used not only with constant baselines but also with more complex ones, such as polynomial baselines. Together with the description of the theory behind VSN, its application on various synthetic multivariate data, as well as on real SWIR spectral data, is presented and discussed.
A common way of dealing with variations in measurement conditions is signal normalization, which may have important drawbacks in terms of physical interpretation and robustness of models. In the present study, a novel algorithm is proposed. It automatically produces a weighting function favoring variables that are only impacted by additive and multiplicative effects. When introduced in a normalization preprocessing step, this weighting function significantly improves signal shape and model interpretation.
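To illustrate the idea in the summary, the following minimal Python sketch compares plain SNV with a weighted variant on two synthetic signals. It is not the VSN algorithm from the article: the weights here are set by hand rather than estimated, and the function names (`snv`, `weighted_snv`), the weighted mean and standard deviation formulation, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def snv(X):
    """Plain SNV: center and scale each signal (row) by its own mean and std."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / std

def weighted_snv(X, w):
    """Hypothetical weighted SNV: mean and std are computed with per-variable
    weights w, so variables with zero weight do not influence the correction."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                           # normalize weights to sum to 1
    mean = (X @ w)[:, None]                   # weighted mean of each row
    std = np.sqrt(((X - mean) ** 2) @ w)[:, None]  # weighted std of each row
    return (X - mean) / std

# Synthetic data (assumed): a shared background plus a peak whose height
# depends on the response of interest.
base = np.linspace(0.0, 1.0, 50)
peak = np.zeros(50)
peak[20:30] = 1.0

x1 = 1.0 * (base + 0.5 * peak) + 0.0   # gain 1.0, offset 0.0, small response
x2 = 2.0 * (base + 1.5 * peak) + 0.3   # gain 2.0, offset 0.3, large response
X = np.vstack([x1, x2])

# Hand-set weights: 1 on variables unaffected by the response, 0 on the peak.
# VSN would estimate such a weighting automatically; that step is not shown.
w = 1.0 - peak
mask = w > 0

X_snv = snv(X)
X_wsnv = weighted_snv(X, w)

# With these weights the background of both signals coincides after correction,
# whereas plain SNV leaves a mismatch because the peak biases mean and std.
print(np.abs(X_snv[0, mask] - X_snv[1, mask]).max())
print(np.abs(X_wsnv[0, mask] - X_wsnv[1, mask]).max())
```

In this toy case the additive and multiplicative effects are removed exactly on the background variables once the response-dependent peak is excluded from the mean and standard deviation, which is the behavior the summary attributes to a suitable weighting function.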
ISSN: 0886-9383, 1099-128X
DOI: 10.1002/cem.3164