Diagnostic suspicion bias and machine learning: Breaking the awareness deadlock for sepsis detection

Bibliographic Details
Published in: PLOS Digital Health, Vol. 2, no. 11, p. e0000365
Main Authors: Prasad, Varesh, Aydemir, Baturay, Kehoe, Iain E., Kotturesh, Chaya, O’Connell, Abigail, Biebelberg, Brett, Wang, Yang, Lynch, James C., Pepino, Jeremy A., Filbin, Michael R., Heldt, Thomas, Reisner, Andrew T.
Format: Journal Article
Language: English
Published: San Francisco: Public Library of Science (PLoS), 01-11-2023
Description
Summary: Many early warning algorithms are downstream of clinical evaluation and diagnostic testing, which means they may not be useful when clinicians fail to suspect illness and do not order the appropriate tests. Depending on how such algorithms handle missing data, they could even indicate “low risk” simply because the testing data were never ordered. We considered predictive methodologies to identify sepsis at triage, before diagnostic tests are ordered, in a busy Emergency Department (ED). One algorithm used “bland clinical data” (data available at triage for nearly every patient). The second algorithm added three yes/no questions to be answered after the triage interview. Retrospectively, we studied adult patients from a single ED between 2014 and 2016, separated into training (70%) and testing (30%) cohorts, plus a final validation cohort of patients from four EDs between 2016 and 2018. Sepsis was defined per the Rhee criteria. Investigational predictors were demographics and triage vital signs (downloaded from the hospital EMR); past medical history; and the auxiliary queries (answered by chart reviewers who were blinded to all data except the triage note and initial HPI). We developed L2-regularized logistic regression models using greedy forward feature selection. There were 1164, 499, and 784 patients in the training, testing, and validation cohorts, respectively. The bland clinical data model yielded ROC AUCs of 0.78 (0.76–0.81) and 0.77 (0.73–0.81) for training and testing, respectively, and ranged from 0.74 to 0.79 in the four-hospital validation. The second model, which included the auxiliary queries, yielded 0.84 (0.82–0.87) and 0.83 (0.79–0.86), and ranged from 0.78 to 0.83 in the four-hospital validation. The first algorithm did not require clinician input but yielded middling performance. The second showed a trend toward superior performance, though it required additional user effort. These methods are alternatives to predictive algorithms downstream of clinical evaluation and diagnostic testing. For hospital early warning algorithms, consideration should be given to the bias and usability of the various methods.
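
The modeling pipeline the abstract describes (L2-regularized logistic regression built by greedy forward feature selection, evaluated by ROC AUC on a 70/30 split) can be sketched roughly as below. This is a minimal illustration, not the study's code: the synthetic data, placeholder feature names, 5-fold cross-validated selection criterion, and regularization strength C=1.0 are all assumptions, with scikit-learn assumed as the library.

# Hypothetical sketch of the abstract's method: L2 logistic regression with
# greedy forward feature selection, scored by ROC AUC. Data are synthetic
# placeholders, not the study's variables.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
n = 1663  # roughly the combined training + testing cohort size reported above
feature_names = ["age", "heart_rate", "resp_rate", "sbp", "temp", "spo2"]
X = rng.normal(size=(n, len(feature_names)))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=2.0, size=n) > 1.5).astype(int)

# 70/30 split mirroring the training/testing partition in the abstract
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

def greedy_forward_selection(X, y):
    """Greedily add the feature whose inclusion most improves
    cross-validated ROC AUC on the training data."""
    selected, remaining, best_auc = [], list(range(X.shape[1])), 0.0
    while remaining:
        candidates = []
        for j in remaining:
            cols = selected + [j]
            model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
            auc = cross_val_score(
                model, X[:, cols], y, cv=5, scoring="roc_auc"
            ).mean()
            candidates.append((auc, j))
        auc, j = max(candidates)
        if auc <= best_auc:  # stop once no candidate improves AUC
            break
        best_auc, selected = auc, selected + [j]
        remaining.remove(j)
    return selected

cols = greedy_forward_selection(X_tr, y_tr)
final = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
final.fit(X_tr[:, cols], y_tr)
test_auc = roc_auc_score(y_te, final.predict_proba(X_te[:, cols])[:, 1])
print("selected:", [feature_names[j] for j in cols], f"test AUC: {test_auc:.2f}")

Note that selecting features by cross-validated AUC on the training partition, rather than on the held-out test set, avoids leaking test information into the model; the abstract does not specify the authors' selection criterion, so this choice is also an assumption.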
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: Investigators ATR, MRF, and TH hold a patent related to sepsis patient management (#WO2016133928A1) which has been licensed to the Nihon Kohden Corporation. These competing interests will not alter adherence to PLOS policies on sharing data and materials.
ISSN: 2767-3170
DOI: 10.1371/journal.pdig.0000365