BERT-Based Clinical Name Entity Reorganization Model for Health Diagnosis

Bibliographic Details
Published in: Disease Markers, Vol. 2022, pp. 1-12
Main Authors: Borole, Yogini Dilip; Agrawal, Anurag Vijay; Tesfayohanis, Miretab; Thakkar, Dhruv; Abonazel, Mohamed R.; Awwad, Fuad A.
Format: Journal Article
Language: English
Published: Amsterdam Hindawi 11-10-2022
Hindawi Limited
Description
Summary: The National Health and Family Planning Commission requires medical institutions to use International Classification of Diseases (ICD) codes. However, because clinical disease descriptions contain many commonly used words, the rate at which diagnosis names entered in electronic medical records map directly to ICD codes is low. In this paper, a disease term map incorporating standard terms is constructed from actual diagnostic data on a regional health platform. Specifically, building on a rule algorithm based on the components of the disease, a data-enhanced BERT (Bidirectional Encoder Representations from Transformers) algorithm for recognizing hypernym-hyponym ("upper and lower") relationships is proposed. Synonymous and hypernym-hyponym relationships between diseases are identified, and the hierarchical structure is further integrated. In addition, a task assignment method based on the association map of diseases and departments is proposed for manual verification. The result is a large-scale disease term map of 94,478 disease entities, including 1,460 synonymous relationships and 46,508 hyponymous relationships. Evaluation experiments show that, based on the disease term map and clinical diagnosis, the coverage rate of diagnostic data is 75.31% higher than that of direct mapping coding based on ICD. In addition, automatic disease coding with the disease term map shortens coding time by about 59.75% compared with manual coding by doctors, with an accuracy of 85%.
ISSN: 0278-0240, 1875-8630
DOI: 10.1155/2022/2297063
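
The coverage comparison mentioned in the summary can be illustrated with a small sketch. This is not the authors' implementation: the term-map entries, the function names (`direct_map`, `term_map_lookup`, `coverage`), and the sample diagnoses are all hypothetical, and plain dictionary lookups stand in for the rule-based and BERT-based relationship recognition described in the paper.

```python
# Toy ICD table: standard disease name -> code (illustrative entries only).
ICD = {
    "type 2 diabetes mellitus": "E11",
    "essential (primary) hypertension": "I10",
}

# Hypothetical term map: a synonym edge points to a standard term,
# a hypernym edge points from a narrower term to a broader one.
SYNONYM = {"high blood pressure": "essential (primary) hypertension"}
HYPERNYM = {"type 2 diabetes with neuropathy": "type 2 diabetes mellitus"}


def direct_map(name):
    """Exact match of a diagnosis name against the ICD table."""
    return ICD.get(name.strip().lower())


def term_map_lookup(name, max_hops=5):
    """Follow synonym/hypernym edges until a term with an ICD code is found."""
    term = name.strip().lower()
    for _ in range(max_hops + 1):
        if term in ICD:
            return ICD[term]
        nxt = SYNONYM.get(term) or HYPERNYM.get(term)
        if nxt is None:
            return None  # no edge to follow: the name stays uncoded
        term = nxt
    return None


def coverage(names, mapper):
    """Fraction of diagnosis names that receive a code."""
    return sum(mapper(n) is not None for n in names) / len(names)


diagnoses = [
    "Type 2 diabetes mellitus",
    "High blood pressure",
    "Type 2 diabetes with neuropathy",
    "unspecified chest discomfort",
]
print(coverage(diagnoses, direct_map))       # 0.25
print(coverage(diagnoses, term_map_lookup))  # 0.75
```

In this toy setting, only one of the four names matches the ICD table exactly, while the synonym and hypernym edges let two more names reach a coded term. Mapping a narrower term to its hypernym gives a coarser code, which is one reason a manual verification step such as the one the summary mentions remains useful.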