
      Accuracy of a Generative Artificial Intelligence Model in a Complex Diagnostic Challenge

      JAMA
      American Medical Association (AMA)


          Abstract

          This study assesses the diagnostic accuracy of the Generative Pre-trained Transformer 4 (GPT-4) artificial intelligence (AI) model in a series of challenging cases.


          Most cited references (5)


          Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

          We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making.

            Differential diagnosis generators: an evaluation of currently available computer programs.

Differential diagnosis (DDX) generators are computer programs that generate a DDX based on various clinical data. We identified evaluation criteria through consensus, applied these criteria to describe the features of DDX generators, and tested performance using cases from the New England Journal of Medicine (NEJM©) and the Medical Knowledge Self-Assessment Program (MKSAP©).

We first identified evaluation criteria by consensus. Then we performed Google® and PubMed searches to identify DDX generators. To be included, DDX generators had to do the following: generate a list of potential diagnoses rather than text or article references; rank or indicate critical diagnoses that need to be considered or eliminated; accept at least two signs, symptoms, or disease characteristics; provide the ability to compare the clinical presentations of diagnoses; and provide diagnoses in general medicine. The evaluation criteria were then applied to the included DDX generators. Lastly, the performance of the DDX generators was tested with findings from 20 test cases. Each case performance was scored one through five, with a score of five indicating presence of the exact diagnosis. Mean scores and confidence intervals were calculated.

Twenty-three programs were initially identified and four met the inclusion criteria. These four programs were evaluated using the consensus criteria, which included the following: input method; mobile access; filtering and refinement; lab values, medications, and geography as diagnostic factors; evidence-based medicine (EBM) content; references; and drug information content source. The mean scores (95% confidence interval) from performance testing on a five-point scale were Isabel© 3.45 (2.53, 4.37), DxPlain® 3.45 (2.63, 4.27), Diagnosis Pro® 2.65 (1.75, 3.55), and PEPID™ 1.70 (0.71, 2.69). The number of exact matches paralleled the mean score finding.

Consensus criteria for DDX generator evaluation were developed. Application of these criteria as well as performance testing supports the use of DxPlain® and Isabel© over the other currently available DDX generators.

              Reasoning Foundations of Medical Diagnosis: Symbolic logic, probability, and value theory aid our understanding of how physicians reason


                Author and article information

                Journal: JAMA
                Publisher: American Medical Association (AMA)
                ISSN: 0098-7484
                Published: June 15, 2023

                Affiliations
                [1] Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts

                Article
                DOI: 10.1001/jama.2023.8288
                PMID: 37318797
                © 2023
