Artificial intelligence in orthopaedics: can Chat Generative Pre-trained Transformer (ChatGPT) pass Section 1 of the Fellowship of the Royal College of Surgeons (Trauma & Orthopaedics) examination?

Abstract

Purpose

Chat Generative Pre-trained Transformer (ChatGPT) is a large language artificial intelligence (AI) model which generates contextually relevant text in response to questioning. After ChatGPT successfully passed the United States Medical Licensing Examinations, proponents have argued it should play an increasing role in medical service provision and education. AI in healthcare remains in its infancy, and the reliability of AI systems must be scrutinized. This study assessed whether ChatGPT could pass Section 1 of the Fellowship of the Royal College of Surgeons (FRCS) examination in Trauma and Orthopaedic Surgery.

Methods

The UK and Ireland In-Training Examination (UKITE) was used as a surrogate for the FRCS. Papers 1 and 2 of UKITE 2022 were directly inputted into ChatGPT. All questions were in a single-best-answer format without wording alterations. Imaging was trialled to ensure ChatGPT utilized this information.

Results

ChatGPT scored 35.8%: 30% lower than the FRCS pass rate and 8.2% lower than the mean score achieved by human candidates of all training levels. Subspecialty analysis demonstrated that ChatGPT scored highest in basic science (53.3%) and lowest in trauma (0%). Of the 87 questions it answered incorrectly, ChatGPT stated that it did not know the answer only once; for the remaining questions it gave incorrect explanatory answers.
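The figures in this paragraph can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the "30% lower" and "8.2% lower" gaps are absolute percentage-point differences (the abstract does not say so explicitly), and the variable and function names are hypothetical rather than taken from the study.

```python
# Figures quoted in the Results paragraph (all in percent).
CHATGPT_SCORE = 35.8
GAP_TO_PASS_MARK = 30.0    # "30% lower than the FRCS pass rate"
GAP_TO_HUMAN_MEAN = 8.2    # "8.2% lower than the mean score" of human candidates

# Counts quoted for the incorrectly answered questions.
INCORRECT_ANSWERS = 87
ADMITTED_NOT_KNOWING = 1


def implied_benchmark(score: float, gap_points: float) -> float:
    """Benchmark implied by a score and a gap, ASSUMING the gap is in percentage points."""
    return round(score + gap_points, 1)


def share_percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


implied_pass_mark = implied_benchmark(CHATGPT_SCORE, GAP_TO_PASS_MARK)
implied_human_mean = implied_benchmark(CHATGPT_SCORE, GAP_TO_HUMAN_MEAN)
admitted_share = share_percent(ADMITTED_NOT_KNOWING, INCORRECT_ANSWERS)

print(f"Implied pass mark (assumption):  {implied_pass_mark}%")
print(f"Implied human mean (assumption): {implied_human_mean}%")
print(f"Wrong answers flagged as 'do not know': {admitted_share}%")
```

Under the percentage-point reading, the pass mark and human mean come out at roughly 65.8% and 44.0%; these are derived values, not figures reported in the abstract.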

Conclusion

ChatGPT is currently unable to exert the higher-order judgement and multilogical thinking required to pass the FRCS examination. Further, the current model fails to recognize its own limitations. ChatGPT’s deficiencies should be publicized as widely as its successes to ensure clinicians remain aware of its fallibility.

Key messages

What is already known on this topic

Following ChatGPT’s much-publicized success in passing the United States Medical Licensing Examinations, clinicians and medical students are using the model with increasing frequency for medical service provision and education. However, ChatGPT remains in its infancy, and the model’s reliability and accuracy remain unproven.

What this study adds

This study demonstrates ChatGPT is currently unable to exert the higher-order judgement and multilogical thinking required to pass the Fellowship of the Royal College of Surgeons (FRCS) (Trauma & Orthopaedics) examination. Further, the current model fails to recognize its own limitations when offering both direct and explanatory answers.

How this study might affect research, practice, or policy

This study highlights the need for medical students and clinicians to exert caution when employing ChatGPT as a revision tool or applying it in clinical practice, and for patients to be aware of its fallibilities when using it as a health resource. Future research questions include:

Most cited references 12

Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

Tiffany H. Kung, Morgan Cheatham, Arielle Medenilla … (2023)

We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making.

Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum

John W. Ayers, Adam Poliak, Mark Dredze … (2023)

Importance: The rapid expansion of virtual health care has caused a surge in patient messages concomitant with more work and burnout among health care professionals. Artificial intelligence (AI) assistants could potentially aid in creating answers to patient questions by drafting responses that could be reviewed by clinicians.

Objective: To evaluate the ability of an AI chatbot assistant (ChatGPT), released in November 2022, to provide quality and empathetic responses to patient questions.

Design, Setting, and Participants: In this cross-sectional study, a public and nonidentifiable database of questions from a public social media forum (Reddit’s r/AskDocs) was used to randomly draw 195 exchanges from October 2022 where a verified physician responded to a public question. Chatbot responses were generated by entering the original question into a fresh session (without prior questions having been asked in the session) on December 22 and 23, 2022. The original question along with anonymized and randomly ordered physician and chatbot responses were evaluated in triplicate by a team of licensed health care professionals. Evaluators chose “which response was better” and judged both “the quality of information provided” (very poor, poor, acceptable, good, or very good) and “the empathy or bedside manner provided” (not empathetic, slightly empathetic, moderately empathetic, empathetic, and very empathetic). Mean outcomes were ordered on a 1 to 5 scale and compared between chatbot and physicians.

Results: Of the 195 questions and responses, evaluators preferred chatbot responses to physician responses in 78.6% (95% CI, 75.0%-81.8%) of the 585 evaluations. Mean (IQR) physician responses were significantly shorter than chatbot responses (52 [17-62] words vs 211 [168-245] words; t = 25.4; P < .001). Chatbot responses were rated of significantly higher quality than physician responses (t = 13.3; P < .001). The proportion of responses rated as good or very good quality (≥4), for instance, was higher for chatbot than physicians (chatbot: 78.5%, 95% CI, 72.3%-84.1%; physicians: 22.1%, 95% CI, 16.4%-28.2%). This amounted to 3.6 times higher prevalence of good or very good quality responses for the chatbot. Chatbot responses were also rated significantly more empathetic than physician responses (t = 18.9; P < .001). The proportion of responses rated empathetic or very empathetic (≥4) was higher for chatbot than for physicians (chatbot: 45.1%, 95% CI, 38.5%-51.8%; physicians: 4.6%, 95% CI, 2.1%-7.7%). This amounted to 9.8 times higher prevalence of empathetic or very empathetic responses for the chatbot.

Conclusions: In this cross-sectional study, a chatbot generated quality and empathetic responses to patient questions posed in an online forum. Further exploration of this technology is warranted in clinical settings, such as using chatbot to draft responses that physicians could then edit. Randomized trials could assess further if using AI assistants might improve responses, lower clinician burnout, and improve patient outcomes.

Artificial Intelligence in Precision Cardiovascular Medicine.

Chayakrit Krittanawong, HongJu Zhang, Zhen Wang … (2017)

Artificial intelligence (AI) is a field of computer science that aims to mimic human thought processes, learning capacity, and knowledge storage. AI techniques have been applied in cardiovascular medicine to explore novel genotypes and phenotypes in existing diseases, improve the quality of patient care, enable cost-effectiveness, and reduce readmission and mortality rates. Over the past decade, several machine-learning techniques have been used for cardiovascular disease diagnosis and prediction. Each problem requires some degree of understanding of the problem, in terms of cardiovascular medicine and statistics, to apply the optimal machine-learning algorithm. In the near future, AI will result in a paradigm shift toward precision cardiovascular medicine. The potential of AI in cardiovascular medicine is tremendous; however, ignorance of the challenges may overshadow its potential clinical impact. This paper gives a glimpse of AI's application in cardiovascular clinical care and discusses its potential role in facilitating precision cardiovascular medicine.

Author and article information

Contributors

Rory Cuthbert

Ashley I Simpson

Journal

Title: Postgraduate Medical Journal

Publisher: Oxford University Press (OUP)

ISSN (Print): 0032-5473

ISSN (Electronic): 1469-0756

Publication date Other: October 2023

Publication date (Print): September 21 2023

Publication date (Electronic): July 06 2023

Volume: 99

Issue: 1176

Pages: 1110-1114

Article

DOI: 10.1093/postmj/qgad053

SO-VID: 11399809-c3b7-4a8c-ad60-d7f0e77eb07a

License:

https://academic.oup.com/pages/standard-publication-reuse-rights

