Evaluation of Rhinoplasty Information from ChatGPT, Gemini, and Claude for Readability and Accuracy
Published in: Aesthetic Plastic Surgery
Format: Journal Article
Language: English
Published: United States, 16-09-2024
Summary: To assess the readability, accuracy, quality, and completeness of responses from ChatGPT (OpenAI, San Francisco, CA), Gemini (Google, Mountain View, CA), and Claude (Anthropic, San Francisco, CA) to common questions about rhinoplasty.
Ten questions commonly encountered in the senior author's (SPM) rhinoplasty practice were presented to ChatGPT-4, Gemini, and Claude. Seven facial plastic and reconstructive surgeons with experience in rhinoplasty evaluated the responses for accuracy, quality, completeness, relevance, and use of medical jargon on a Likert scale. The responses were also evaluated using several readability indices.
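The summary does not name the specific readability indices used. For illustration only (an assumption, not drawn from the article), one widely used index is the Flesch-Kincaid Grade Level, computed as:

\[ \text{Grade Level} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59 \]

Scores of roughly 13 and above correspond to college-level reading, the level reported for all three chatbots below.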
ChatGPT achieved significantly higher evaluator scores for accuracy and overall quality but scored significantly lower on completeness than Gemini and Claude. Responses from all three chatbots to the ten questions were rated as neutral to incomplete. All three chatbots used medical jargon, and all scored at a college reading level on the readability indices.
Rhinoplasty surgeons should be aware that the medical information provided by chatbot platforms is incomplete and must still be scrutinized for accuracy. However, the technology has potential for use in healthcare education if trained on evidence-based recommendations and improved for readability.
This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors, www.springer.com/00266.
ISSN: 1432-5241
DOI: 10.1007/s00266-024-04343-0