
      Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine

      New England Journal of Medicine
      Massachusetts Medical Society

          Most cited references (5)

          Is Open Access

          Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

          We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making.

            Deep Learning Applications in Medical Image Analysis

              Is Open Access

              Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians?

              Machine learning can help clinicians to make individualized patient predictions only if researchers demonstrate models that contribute novel insights, rather than learning the most likely next step in a set of actions a clinician will take. We trained deep learning models using only clinician-initiated, administrative data for 42.9 million admissions using three subsets of data: demographic data only, demographic data and information available at admission, and the previous data plus charges recorded during the first day of admission. Models trained on charges during the first day of admission achieve performance close to published full EMR-based benchmarks for inpatient outcomes: in-hospital mortality (0.89 AUC), prolonged length of stay (0.82 AUC), and 30-day readmission rate (0.71 AUC). Similar performance between models trained with only clinician-initiated data and those trained with full EMR data purporting to include information about patient state and physiology should raise concern in the deployment of these models. Furthermore, these models exhibited significant declines in performance when evaluated over only myocardial infarction (MI) patients relative to models trained over MI patients alone, highlighting the importance of physician diagnosis in the prognostic performance of these models. These results provide a benchmark for predictive accuracy trained only on prior clinical actions and indicate that models with similar performance may derive their signal by looking over clinicians' shoulders, using clinical behavior as the expression of preexisting intuition and suspicion to generate a prediction. For models to guide clinicians in individual decisions, performance exceeding these benchmarks is necessary.

                Author and article information

                Journal
                New England Journal of Medicine (N Engl J Med)
                Publisher: Massachusetts Medical Society
                ISSN: 0028-4793, 1533-4406
                Published: March 30 2023
                Volume: 388
                Issue: 13
                Pages: 1233-1239
                Affiliations
                [1] From Microsoft Research, Redmond, WA (P.L., S.B.); and Nuance Communications, Burlington, MA (J.P.).
                Article
                DOI: 10.1056/NEJMsr2214184
                PMID: 36988602
                b83f1e74-4a47-4adc-8395-3e0aa92b3c21
                © 2023