Cohort selection for clinical trials: n2c2 2018 shared task track 1
Published in: | Journal of the American Medical Informatics Association : JAMIA Vol. 26; no. 11; pp. 1163 - 1171 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | England: Oxford University Press, 01-11-2019 |
Summary: | Abstract
Objective
Track 1 of the 2018 National NLP Clinical Challenges shared tasks focused on identifying which patients in a corpus of longitudinal medical records meet or do not meet the identified selection criteria.
Materials and Methods
To address this challenge, we annotated American English clinical narratives for 288 patients according to whether they met these criteria. We chose criteria from existing clinical trials that represented a variety of natural language processing tasks, including concept extraction, temporal reasoning, and inference.
Results
A total of 47 teams, comprising 224 participants from 18 countries, took part in this shared task and submitted 109 system outputs. The best-performing system achieved a micro F1 score of 0.91 using a rule-based approach. The top 10 teams used rule-based and hybrid systems to approach the problems.
Discussion
Clinical narratives are open to interpretation, particularly in cases where the selection criterion may be underspecified. This leaves room for annotators to use domain knowledge and intuition in selecting patients, which may lead to errors in system outputs. However, teams that consulted medical professionals while building their systems were more likely to have high recall for patients, which is preferable for patient selection systems.
Conclusions
There is not yet a one-size-fits-all solution for natural language processing systems approaching this task. Future research in this area can examine criteria requiring even more complex inferences, temporal reasoning, and domain knowledge. |
ISSN: | 1067-5027 1527-974X |
DOI: | 10.1093/jamia/ocz163 |
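
The summary reports a micro F1 score for the best-performing system. As a rough illustration of what a micro-averaged F1 measures, the sketch below pools every (patient, criterion) decision into one set of counts before computing a single score. It assumes the generic textbook definition with "met" as the positive label. The function name, label strings, and criterion names are hypothetical, and the official challenge evaluation may differ in detail.

```python
from collections import Counter


def micro_f1(gold, pred, positive="met"):
    """Micro-averaged F1 over all (patient, criterion) decisions.

    gold, pred: dicts mapping (patient_id, criterion) -> label.
    Assumption: generic micro F1, not the official challenge scorer.
    """
    counts = Counter()
    for key, gold_label in gold.items():
        pred_label = pred.get(key)  # a missing prediction counts as not positive
        if pred_label == positive and gold_label == positive:
            counts["tp"] += 1
        elif pred_label == positive:
            counts["fp"] += 1
        elif gold_label == positive:
            counts["fn"] += 1
    # F1 = 2TP / (2TP + FP + FN), computed on the pooled counts
    denom = 2 * counts["tp"] + counts["fp"] + counts["fn"]
    return 2 * counts["tp"] / denom if denom else 0.0


# Hypothetical toy data: two patients, two made-up criteria.
gold = {
    ("p1", "CRITERION_A"): "met",
    ("p1", "CRITERION_B"): "not met",
    ("p2", "CRITERION_A"): "met",
    ("p2", "CRITERION_B"): "met",
}
pred = {
    ("p1", "CRITERION_A"): "met",
    ("p1", "CRITERION_B"): "met",
    ("p2", "CRITERION_A"): "not met",
    ("p2", "CRITERION_B"): "met",
}
score = micro_f1(gold, pred)
```

Because the counts are pooled, criteria with many positive patients weigh more heavily in the final score than rare ones, which is the main difference from a macro average.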