Incorporating domain knowledge into machine learning for laser-induced breakdown spectroscopy quantification

Bibliographic Details
Published in: Spectrochimica acta. Part B: Atomic spectroscopy, Vol. 195, p. 106490
Main Authors: Song, Weiran, Hou, Zongyu, Gu, Weilun, Afgan, Muhammad Sher, Cui, Jiacheng, Wang, Hui, Wang, Yun, Wang, Zhe
Format: Journal Article
Language: English
Published: Elsevier B.V. 01-09-2022
Description
Summary: During the last decade, various machine learning methods have been applied to improve the accuracy of quantitative analysis in laser-induced breakdown spectroscopy (LIBS) by modelling the complex relationship between spectral intensity and analyte concentration. However, machine learning models tend to be highly complex, and their predictions are difficult to interpret. Moreover, their decision-making mechanisms rarely consider the physical principles behind quantitative analysis, which can reduce LIBS quantification accuracy and raises the question of whether the quantification results can be trusted. This work investigates the feasibility of incorporating domain knowledge into machine learning to improve LIBS quantification performance. A new regression method based on the dominant factor and the kernel extreme learning machine, named DF-K-ELM, is proposed. It uses knowledge-based spectral lines, related to analyte compositions, to construct a linear model based on physical principles, and adopts K-ELM to account for the residuals of that linear model. DF-K-ELM shows explicitly how the knowledge-based spectral lines influence the prediction results, improving model interpretability without reducing model complexity. The proposed method is tested on 10 regression tasks based on 3 LIBS datasets and compared with 6 baseline methods. It achieves the best performance on 4 tasks and the second-best performance on 2 tasks. Moreover, compared to traditional machine learning methods, dominant factor based methods yield higher accuracy in most cases. These results demonstrate that incorporating domain knowledge into machine learning is a viable approach to improving the performance of LIBS quantification.
• Machine learning combined with expert knowledge.
• High model complexity and improved model interpretability.
• Good generalization performance for LIBS quantification.
ISSN: 0584-8547, 1873-3565
DOI: 10.1016/j.sab.2022.106490
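
The two-stage idea described in the summary (a linear model on knowledge-based spectral lines, followed by K-ELM on its residuals) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DFKELMSketch`, the RBF kernel, the values of `C` and `gamma`, the choice to feed the full spectrum to the kernel stage, and the synthetic data are all assumptions made for illustration. The kernel stage uses the standard K-ELM closed form, in which the output weights solve (K + I/C) alpha = target.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


class DFKELMSketch:
    """Hypothetical two-stage regressor: linear dominant factor + K-ELM residual."""

    def __init__(self, line_idx, C=100.0, gamma=0.1):
        self.line_idx = list(line_idx)  # columns of the knowledge-based lines
        self.C = C                      # K-ELM regularization (assumed value)
        self.gamma = gamma              # RBF width (assumed value)

    def _design(self, X):
        # Intensities of the selected lines plus an intercept column.
        return np.column_stack([X[:, self.line_idx], np.ones(len(X))])

    def fit(self, X, y):
        # Stage 1: least-squares linear model on the knowledge-based lines.
        L = self._design(X)
        self.w, *_ = np.linalg.lstsq(L, y, rcond=None)
        residual = y - L @ self.w
        # Stage 2: K-ELM fitted to the residuals, here using the full spectrum
        # as input (an assumption; the summary does not specify the inputs).
        K = rbf_kernel(X, X, self.gamma)
        self.alpha = np.linalg.solve(K + np.eye(len(X)) / self.C, residual)
        self.X_train = X
        return self

    def predict_parts(self, X):
        # Return both parts so the linear contribution can be inspected.
        dominant = self._design(X) @ self.w
        correction = rbf_kernel(X, self.X_train, self.gamma) @ self.alpha
        return dominant, correction

    def predict(self, X):
        dominant, correction = self.predict_parts(X)
        return dominant + correction


# Synthetic stand-in for LIBS spectra: 40 samples, 50 spectral channels.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 50))
y = (2.0 * X[:, 5] + 0.5 * X[:, 12]      # linear part on two "known" lines
     + 0.3 * np.sin(3.0 * X[:, 20])      # nonlinear effect left to K-ELM
     + 0.01 * rng.normal(size=40))

model = DFKELMSketch(line_idx=[5, 12]).fit(X, y)
dominant, correction = model.predict_parts(X)
y_hat = model.predict(X)
```

Keeping `dominant` and `correction` separate is what makes the linear part inspectable: the coefficients in `model.w` show directly how each selected line contributes to the prediction, while the kernel term only adjusts what the linear model leaves unexplained.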