
      A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review

      research-article
Malik Sallam, MD, PhD 1,2,3; Muna Barakat, PhD 4; Mohammed Sallam, MSc, PharmD 5
      Interactive Journal of Medical Research
      JMIR Publications
Keywords: guidelines, evaluation, meaningful analytics, large language models, decision support


          Abstract

          Background

          Adherence to evidence-based practice is indispensable in health care. Recently, the utility of generative artificial intelligence (AI) models in health care has been evaluated extensively. However, the lack of consensus guidelines on the design and reporting of findings of these studies poses a challenge for the interpretation and synthesis of evidence.

          Objective

          This study aimed to develop a preliminary checklist to standardize the reporting of generative AI-based studies in health care education and practice.

          Methods

A literature review was conducted in Scopus, PubMed, and Google Scholar. Published records with “ChatGPT,” “Bing,” or “Bard” in the title were retrieved. The methodologies employed in the included records were carefully examined to identify common pertinent themes and possible gaps in reporting. A panel discussion was held to establish a unified and thorough checklist for the reporting of AI studies in health care. Two independent raters then used the finalized checklist to evaluate the included records, and Cohen κ was used to assess interrater reliability.
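As a rough illustration of the interrater reliability statistic mentioned above, the Python sketch below computes an unweighted Cohen κ for two raters' item-level ratings. The ratings shown are hypothetical, and the study may well have used a weighted variant for ordinal scores; this is a minimal sketch of the unweighted case only.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected overlap from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of 10 records on one checklist item (1-5 scale).
a = [4, 3, 5, 2, 4, 3, 3, 5, 1, 4]
b = [4, 3, 4, 2, 4, 3, 2, 5, 1, 4]
print(round(cohen_kappa(a, b), 3))  # ~0.74 for these toy ratings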

          Results

The final data set that formed the basis for pertinent theme identification and analysis comprised a total of 34 records. The finalized checklist included 9 pertinent themes collectively referred to as METRICS (Model, Evaluation, Timing, Range/Randomization, Individual factors, Count, and Specificity of prompts and language). Their details are as follows: (1) Model used and its exact settings; (2) Evaluation approach for the generated content; (3) Timing of testing the model; (4) Transparency of the data source; (5) Range of tested topics; (6) Randomization of selecting the queries; (7) Individual factors in selecting the queries and interrater reliability; (8) Count of queries executed to test the model; and (9) Specificity of the prompts and language used. The overall mean METRICS score was 3.0 (SD 0.58). Interrater reliability of the METRICS scores was acceptable, with Cohen κ ranging from 0.558 to 0.962 (P<.001 for the 9 tested items). When classified per item, the highest average METRICS score was recorded for the “Model” item, followed by the “Specificity” item, while the lowest scores were recorded for the “Randomization” item (classified as suboptimal) and the “Individual factors” item (classified as satisfactory).
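To make the aggregation above concrete, the sketch below scores a few hypothetical records on the 9 METRICS items and reports the overall mean (with SD) and per-item means with a classification band. The ratings, the 1-to-5 scale, and the cutoff values behind labels such as "satisfactory" and "suboptimal" are assumptions for illustration; the abstract does not specify them.

```python
import statistics

# The 9 METRICS items, in the order listed in the abstract.
ITEMS = ["Model", "Evaluation", "Timing", "Transparency", "Range",
         "Randomization", "Individual factors", "Count", "Specificity"]

# Hypothetical per-item ratings (1-5) for three records; the real study
# rated 34 records.
records = {
    "record_01": [5, 4, 2, 3, 4, 1, 2, 3, 5],
    "record_02": [4, 4, 3, 3, 3, 2, 2, 4, 4],
    "record_03": [5, 3, 2, 4, 3, 1, 3, 3, 5],
}

def classify(score):
    # Illustrative bands only; the study's cutoffs are not given here.
    if score >= 4:
        return "good"
    if score >= 3:
        return "satisfactory"
    return "suboptimal"

# Overall mean METRICS score across records.
per_record = [statistics.mean(r) for r in records.values()]
print(f"overall: {statistics.mean(per_record):.2f} "
      f"(SD {statistics.stdev(per_record):.2f})")

# Mean score and classification per item across records.
for i, item in enumerate(ITEMS):
    m = statistics.mean(r[i] for r in records.values())
    print(f"{item}: {m:.2f} ({classify(m)})")
```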

          Conclusions

The METRICS checklist can facilitate the design of studies and guide researchers toward best practices in reporting results. The findings highlight the need for standardized reporting algorithms for generative AI-based studies in health care, given the variability observed in methodologies and reporting. The proposed METRICS checklist could serve as a helpful preliminary basis for establishing a universally accepted approach to standardize the design and reporting of generative AI-based studies in health care, a swiftly evolving research topic.


                Author and article information

Journal
Interact J Med Res (Interactive Journal of Medical Research)
JMIR Publications (Toronto, Canada)
ISSN: 1929-073X
Published: 15 February 2024; Volume 13: e54704
                Affiliations
[1] Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, Jordan
[2] Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Amman, Jordan
[3] Department of Translational Medicine, Faculty of Medicine, Lund University, Malmö, Sweden
[4] Department of Clinical Pharmacy and Therapeutics, Faculty of Pharmacy, Applied Science Private University, Amman, Jordan
[5] Department of Pharmacy, Mediclinic Parkview Hospital, Mediclinic Middle East, Dubai, United Arab Emirates
                Author notes
Corresponding Author: Malik Sallam, malik.sallam@ju.edu.jo
                Author information
                https://orcid.org/0000-0002-0165-9670
                https://orcid.org/0000-0002-7966-1172
                https://orcid.org/0000-0003-3273-524X
Article
Article ID: v13i1e54704
DOI: 10.2196/54704
PMCID: PMC10905357
PMID: 38276872
                ©Malik Sallam, Muna Barakat, Mohammed Sallam. Originally published in the Interactive Journal of Medical Research (https://www.i-jmr.org/), 15.02.2024.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Interactive Journal of Medical Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.i-jmr.org/, as well as this copyright and license information must be included.

History: received 19 November 2023; 13 December 2023; 18 December 2023; accepted 26 January 2024
                Categories
                Original Paper

