Valuing free-form text data from maintenance logs through transfer learning with CamemBERT

Bibliographic Details
Published in: Enterprise Information Systems Vol. 16; no. 6; pp. 1-29
Main Authors: Usuga Cadavid, Juan Pablo, Grabot, Bernard, Lamouri, Samir, Pellerin, Robert, Fortin, Arnaud
Format: Journal Article
Language: English
Published: Taylor & Francis 03-06-2022
Description
Summary: Coupling a production scheduling process with maintenance logs can provide important advantages. For instance, it allows planning to be adapted to the reality of the shop floor. Nevertheless, maintenance logs are often highly unstructured, as they mainly rely on free-form text comments from operators, and imbalanced, as commonplace issues happen more often than critical problems. Both properties hinder the application of machine learning methods to these data. This study therefore explores the use of a recent model named CamemBERT to tackle these difficulties through transfer learning. More specifically, the purpose is to predict the criticality and duration of a maintenance issue from the description provided. Findings suggest that fine-tuning CamemBERT outperforms classical and feature-based approaches. Furthermore, the class imbalance problem is addressed from both a data pre-processing and a training perspective: first, k-means with silhouette diagrams allowed the creation of more homogeneous classes; second, resampling improved the model's performance.
ISSN: 1751-7575, 1751-7583
DOI: 10.1080/17517575.2020.1790043
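
The summary mentions k-means with silhouette diagrams as a way to form more homogeneous classes. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the clustered quantity is a one-dimensional issue duration in minutes, which the summary does not state, and the sample values and the function names (`kmeans_1d`, `silhouette_values`, `choose_k`) are hypothetical. A silhouette diagram plots the per-sample values that `silhouette_values` returns; here they are simply averaged to compare candidate numbers of classes.

```python
from statistics import fmean


def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd's k-means on scalars, with deterministic quantile-based starts."""
    distinct = sorted(set(values))
    if k > len(distinct):
        raise ValueError("k cannot exceed the number of distinct values")
    # Spread the initial centers evenly over the sorted distinct values.
    centers = [distinct[int((i + 0.5) * len(distinct) / k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(fmean(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels, centers


def silhouette_values(values, labels):
    """Per-sample silhouette s = (b - a) / max(a, b), using absolute differences."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # convention for singletons and single-cluster cases
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            fmean(abs(v - u) for u in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def choose_k(values, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {}
    for k in candidates:
        labels, _ = kmeans_1d(values, k)
        scores[k] = fmean(silhouette_values(values, labels))
    return max(scores, key=scores.get), scores


# Hypothetical durations (minutes), invented for illustration only.
durations = [5, 6, 7, 8, 60, 62, 65, 70, 240, 250, 260]
best_k, all_scores = choose_k(durations, range(2, 6))
```

The cluster labels obtained for the chosen `k` could then serve as duration classes for a classifier; whether the study proceeds exactly this way is not stated in the summary.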
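
The summary also reports that resampling improved model performance on the imbalanced classes, without naming the scheme in this record. As one possible illustration, the sketch below shows random oversampling, which duplicates minority-class examples until every class matches the largest one. This choice is an assumption, as are the function name `random_oversample` and the example comments and labels, which are invented. Resampling of this kind is normally applied to the training split only, so that evaluation data keep their original distribution.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until all classes are the same size."""
    rng = random.Random(seed)

    # Group the examples by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    pairs = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((text, label) for text in items + extra)

    # Shuffle so that duplicated examples are not grouped together.
    rng.shuffle(pairs)
    return [text for text, _ in pairs], [label for _, label in pairs]


# Hypothetical operator comments and criticality labels, for illustration only.
comments = [
    "fuite d'huile legere",
    "capteur a nettoyer",
    "courroie a retendre",
    "voyant allume",
    "arret complet de la ligne",
]
criticality = ["low", "low", "low", "low", "high"]

balanced_comments, balanced_labels = random_oversample(comments, criticality, seed=1)
class_counts = Counter(balanced_labels)
```

After the call, `class_counts` holds equal counts for both classes, and the balanced lists can be passed to whatever tokenization and fine-tuning step follows.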