Predicting seizure recurrence after an initial seizure-like episode from routine clinical notes using large language models: a retrospective cohort study

Bibliographic Details
Published in:	The Lancet. Digital health Vol. 5; no. 12; pp. e882 - e894
Main Authors:	Beaulieu-Jones, Brett K; Villamar, Mauricio F; Scordis, Phil; Bartmann, Ana Paula; Ali, Waqar; Wissel, Benjamin D; Alsentzer, Emily; de Jong, Johann; Patra, Arijit; Kohane, Isaac
Format:	Journal Article
Language:	English
Published:	England 01-12-2023
Subjects:	Adult; Child; Electronic Health Records; Epilepsy; Humans; Machine Learning; Retrospective Studies; Seizures - diagnosis; Young Adult

Description
Summary:	The evaluation and management of first-time seizure-like events in children can be difficult because these episodes are not always directly observed and might be epileptic seizures or other conditions (seizure mimics). We aimed to evaluate whether machine learning models using real-world data could predict seizure recurrence after an initial seizure-like event.
This retrospective cohort study compared models trained and evaluated on two separate datasets between Jan 1, 2010, and Jan 1, 2020: electronic medical records (EMRs) at Boston Children's Hospital and de-identified, patient-level, administrative claims data from the IBM MarketScan research database. The study population comprised patients with an initial diagnosis of either epilepsy or convulsions before the age of 21 years, based on International Classification of Diseases, Clinical Modification (ICD-CM) codes. We compared machine learning-based predictive modelling using structured data (logistic regression and XGBoost) with emerging techniques in natural language processing by use of large language models.
The primary cohort comprised 14 021 patients at Boston Children's Hospital matching inclusion criteria with an initial seizure-like event, and the comparison cohort comprised 15 062 patients within the IBM MarketScan research database. Seizure recurrence based on a composite expert-derived definition occurred in 57% of patients at Boston Children's Hospital and 63% of patients within IBM MarketScan. Large language models with additional domain-specific and location-specific pre-training on patients excluded from the study (F1-score 0·826 [95% CI 0·817-0·835], AUROC 0·897 [95% CI 0·875-0·913]) performed best. All large language models, including the base model without additional pre-training (F1-score 0·739 [95% CI 0·738-0·741], AUROC 0·846 [95% CI 0·826-0·861]), outperformed models trained with structured data. With structured data only, XGBoost outperformed logistic regression, and XGBoost models trained with the Boston Children's Hospital EMR (logistic regression: F1-score 0·650 [95% CI 0·643-0·657], AUROC 0·694 [95% CI 0·685-0·705]; XGBoost: F1-score 0·679 [0·676-0·683], AUROC 0·725 [0·717-0·734]) performed similarly to models trained on the IBM MarketScan database (logistic regression: F1-score 0·596 [0·590-0·601], AUROC 0·670 [0·664-0·675]; XGBoost: F1-score 0·678 [0·668-0·687], AUROC 0·710 [0·703-0·714]).
Physicians' clinical notes about an initial seizure-like event include substantial signals for prediction of seizure recurrence, and additional domain-specific and location-specific pre-training can significantly improve the performance of clinical large language models, even for specialised cohorts.
Funding: UCB, National Institute of Neurological Disorders and Stroke (US National Institutes of Health).
Bibliography:	Contributors: BKB-J contributed to the conception or design of the work; data acquisition (pre-processing), analysis and interpretation; software development; and drafting and revising the manuscript. PS contributed to the conception or design of the work and revising the manuscript. IK contributed to the conception or design of the work, data acquisition and interpretation, and drafting and revising the manuscript. APB and MFV contributed to data interpretation and drafting and revising the manuscript. BDW and EA contributed to data interpretation and revising the manuscript. JdJ contributed to revising the manuscript. WA contributed to data interpretation and drafting and revising the manuscript. AP contributed to revising the manuscript. BKB-J and IK had access to all the data from Boston Children's Hospital, and WA, AP, and BKB-J had access to all the data from IBM MarketScan, and have verified the results.
ISSN:	2589-7500
DOI:	10.1016/S2589-7500(23)00179-6