Learning to Ask Like a Physician
Main Authors: | , , , , , , , , , , , , , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 06-06-2022 |
Summary: | Existing question answering (QA) datasets derived from electronic health
records (EHR) are artificially generated and consequently fail to capture
realistic physician information needs. We present Discharge Summary Clinical
Questions (DiSCQ), a newly curated question dataset composed of 2,000+
questions paired with the snippets of text (triggers) that prompted each
question. The questions are generated by medical experts from 100+ MIMIC-III
discharge summaries. We analyze this dataset to characterize the types of
information sought by medical experts. We also train baseline models for
trigger detection and question generation (QG), paired with unsupervised answer
retrieval over EHRs. Our baseline model is able to generate high-quality
questions in over 62% of cases when prompted with human-selected triggers. We
release this dataset (and all code to reproduce baseline model results) to
facilitate further research into realistic clinical QA and QG:
https://github.com/elehman16/discq. |
---|---|
DOI: | 10.48550/arxiv.2206.02696 |