Valuing free-form text data from maintenance logs through transfer learning with CamemBERT
Published in: | Enterprise Information Systems, Vol. 16, no. 6, pp. 1-29 |
---|---|
Main Authors: | , , , , |
Format: | Journal Article |
Language: | English |
Published: | Taylor & Francis, 03-06-2022 |
Subjects: | |
Summary: | Coupling a production scheduling process with maintenance logs can provide important advantages; for instance, it allows planning to be adapted to the reality of the shop floor. However, maintenance logs are often highly unstructured, as they rely mainly on free-form text comments from operators, and imbalanced, as commonplace issues occur more often than critical problems. This hinders the application of machine learning methods to these data. This study therefore explores the use of a recent model, CamemBERT, to tackle these difficulties through transfer learning. More specifically, the aim is to predict the criticality and duration of a maintenance issue from the description provided. The findings suggest that fine-tuning CamemBERT outperforms classical and feature-based approaches. Furthermore, the class imbalance problem is addressed from both a data pre-processing and a training perspective: first, k-means with silhouette diagrams allowed the creation of more homogeneous classes; second, resampling improved the model's performance. |
ISSN: | 1751-7575, 1751-7583 |
DOI: | 10.1080/17517575.2020.1790043 |
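
The Summary states that resampling was used to address class imbalance but does not say which scheme was applied. As a minimal sketch, assuming plain random oversampling of the minority classes (one common option, not necessarily the one used in the article), the idea could look like this in Python. The function name `random_oversample`, the toy log entries, and the labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until every class
    has as many examples as the largest class (assumed scheme)."""
    rng = random.Random(seed)

    # Group the log entries by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing examples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so that the classes are interleaved in the output.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Hypothetical toy maintenance comments: four minor issues, one critical.
texts = [
    "fuite d'huile",
    "capteur en panne",
    "moteur en surchauffe",
    "bruit anormal",
    "filtre sale",
]
labels = ["minor", "minor", "critical", "minor", "minor"]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated comments do not leak into the evaluation data.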