
      Assessing the Accuracy, Completeness, and Reliability of Artificial Intelligence-Generated Responses in Dentistry: A Pilot Study Evaluating the ChatGPT Model

      research-article


          Abstract

Background: Artificial intelligence (AI) can serve as a tool for diagnosis and knowledge acquisition, particularly in dentistry, sparking debate about its application in clinical decision-making.

          Objective: This study aims to evaluate the accuracy, completeness, and reliability of the responses generated by Chatbot Generative Pre-Trained Transformer (ChatGPT) 3.5 in dentistry using expert-formulated questions.

Materials and methods: Experts were invited to create three questions, answers, and respective references according to their specialized fields of activity. A Likert scale was used to evaluate levels of agreement between the expert answers and the ChatGPT responses. Statistical analysis compared descriptive and binary question groups in terms of accuracy and completeness. Questions with low accuracy underwent re-evaluation, and the subsequent responses were compared for improvement. The Wilcoxon test was used (α = 0.05).

Results: Ten experts across six dental specialties generated 30 binary and descriptive dental questions and references. The accuracy score had a median of 5.50 and a mean of 4.17; for completeness, the median was 2.00 and the mean was 2.07. No difference was observed between descriptive and binary responses for accuracy or completeness. However, re-evaluated responses showed a significant improvement in both accuracy (median 5.50 vs. 6.00; mean 4.17 vs. 4.80; p = 0.042) and completeness (median 2.00 vs. 2.00; mean 2.07 vs. 2.30; p = 0.011). The references provided were more often incorrect than correct, with no difference between descriptive and binary questions.

Conclusions: ChatGPT initially demonstrated good accuracy and completeness, which improved further with machine learning (ML) over time. However, some inaccurate answers and references persisted. Human critical discernment remains essential for handling complex clinical cases and for advancing theoretical knowledge and evidence-based practice.
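To make the statistical comparison concrete, here is a minimal sketch of the paired Wilcoxon signed-rank test the abstract describes, run at α = 0.05 in Python with SciPy. The rating vectors are invented placeholders, not the study's data; only the test itself mirrors the stated method.

# Minimal sketch (invented data): paired Wilcoxon signed-rank test at
# alpha = 0.05, mirroring the abstract's comparison of initial vs.
# re-evaluated ChatGPT ratings. These vectors are placeholders, not
# the study's actual scores.
from scipy.stats import wilcoxon

initial = [6, 5, 2, 6, 1, 6, 5, 2, 6, 3]      # first-pass accuracy ratings
reevaluated = [6, 6, 4, 6, 2, 6, 6, 3, 6, 4]  # ratings after re-evaluation

stat, p = wilcoxon(initial, reevaluated)  # ties (zero differences) are dropped
print(f"W = {stat:.1f}, p = {p:.3f}")
if p < 0.05:
    print("Significant difference between paired ratings at alpha = 0.05")

The signed-rank test works on within-pair differences and makes no normality assumption, which is why it is a common choice over a paired t-test for ordinal Likert-type ratings like these.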


Most cited references (30)


          Chatbot for Health Care and Oncology Applications Using Artificial Intelligence and Machine Learning: Systematic Review

Background: Chatbot is a timely topic applied in various fields, including medicine and health care, for human-like knowledge transfer and communication. Machine learning, a subset of artificial intelligence, has been proven particularly applicable in health care, with the ability for complex dialog management and conversational flexibility.

Objective: This review article aims to report on the recent advances and current trends in chatbot technology in medicine. A brief historical overview, along with the developmental progress and design characteristics, is first introduced. The focus will be on cancer therapy, with in-depth discussions and examples of diagnosis, treatment, monitoring, patient support, workflow efficiency, and health promotion. In addition, this paper will explore the limitations and areas of concern, highlighting ethical, moral, security, technical, and regulatory standards and evaluation issues to explain the hesitancy in implementation.

Methods: A search of the literature published in the past 20 years was conducted using the IEEE Xplore, PubMed, Web of Science, Scopus, and OVID databases. The screening of chatbots was guided by the open-access Botlist directory for health care components and further divided according to the following criteria: diagnosis, treatment, monitoring, support, workflow, and health promotion.

Results: Even after addressing these issues and establishing the safety or efficacy of chatbots, human elements in health care will not be replaceable. Therefore, chatbots have the potential to be integrated into clinical practice by working alongside health practitioners to reduce costs, refine workflow efficiencies, and improve patient outcomes. Other applications in pandemic support, global health, and education are yet to be fully explored.

Conclusions: Further research and interdisciplinary collaboration could advance this technology to dramatically improve the quality of care for patients, rebalance the workload for clinicians, and revolutionize the practice of medicine.

            Applicability of ChatGPT in Assisting to Solve Higher Order Problems in Pathology

Background: Artificial intelligence (AI) is evolving for healthcare services. Higher cognitive thinking in AI refers to the ability of the system to perform advanced cognitive processes, such as problem-solving, decision-making, reasoning, and perception. This type of thinking goes beyond simple data processing and involves the ability to understand and manipulate abstract concepts, interpret and use information in a contextually relevant way, and generate new insights based on past experiences and accumulated knowledge. Natural language processing models like ChatGPT are conversational programs that can interact with humans to provide answers to queries.

Objective: We aimed to ascertain the capability of ChatGPT in solving higher-order reasoning problems in the subject of pathology.

Methods: This cross-sectional study was conducted on the internet using an AI-based chat program that provides free service for research purposes. The current version of ChatGPT (January 30 version) was used to converse with a total of 100 higher-order reasoning queries. These questions were randomly selected from the question bank of the institution and categorized according to different organ systems. The responses to each question were collected and stored for further analysis. The responses were evaluated by three expert pathologists on a zero-to-five scale and categorized into the structure of the observed learning outcome (SOLO) taxonomy categories. The scores were compared by a one-sample median test against hypothetical values to assess accuracy.

Results: A total of 100 higher-order reasoning questions were solved by the program, with an average of 45.31 ± 7.14 seconds per answer. The overall median score was 4.08 (Q1-Q3: 4-4.33), which was below the hypothetical maximum value of five (one-sample median test p < 0.0001) and similar to four (one-sample median test p = 0.14). The majority (86%) of the responses were in the "relational" category of the SOLO taxonomy. There was no difference in the scores of the responses to questions from various organ systems in the subject of pathology (Kruskal-Wallis p = 0.55). The scores rated by the three pathologists had an excellent level of inter-rater reliability (ICC = 0.975 [95% CI: 0.965-0.983]; F = 40.26; p < 0.0001).

Conclusion: The capability of ChatGPT to solve higher-order reasoning questions in pathology reached a relational level of accuracy; the text output connected its parts to provide a meaningful response. The program's answers scored approximately 80%. Hence, academicians and students can get help from the program for solving reasoning-type questions as well. As the program is evolving, further studies are needed to determine the accuracy of future versions.
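The "one-sample median test" mentioned above can be implemented several ways; one common choice is the exact sign test, sketched below with invented ratings. This is an illustrative assumption, not necessarily the exact procedure the authors used.

# Minimal sketch (invented data): one-sample median test via the exact
# sign test, comparing ratings against a hypothesized median of 4 as in
# the abstract. Counts above/below the median are tested with a binomial.
from scipy.stats import binomtest

hypothesized_median = 4.0
ratings = [4.33, 4.0, 3.67, 4.0, 4.33, 4.67, 3.67, 4.0, 4.33, 4.0]

above = sum(r > hypothesized_median for r in ratings)
below = sum(r < hypothesized_median for r in ratings)  # ties are excluded
result = binomtest(above, n=above + below, p=0.5)
print(f"sign test p = {result.pvalue:.3f} against median {hypothesized_median}")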

              Assessing the Capability of ChatGPT in Answering First- and Second-Order Knowledge Questions on Microbiology as per Competency-Based Medical Education Curriculum

Background and objective: ChatGPT is an artificial intelligence (AI) language model that has been trained to process and respond to questions across a wide range of topics, and it is also capable of solving problems in medical educational topics. However, the capability of ChatGPT to accurately answer first- and second-order knowledge questions in the field of microbiology has not been explored so far. Hence, in this study, we aimed to analyze the capability of ChatGPT to answer first- and second-order questions on the subject of microbiology.

Materials and methods: Based on the competency-based medical education (CBME) curriculum for the subject of microbiology, we prepared a set of first-order and second-order questions. For each of the eight modules in the National Medical Commission-recommended CBME curriculum for microbiology, we prepared six first-order and six second-order knowledge questions, amounting to a total of 96 questions (8 modules × 12 questions). The questions were checked for content validity by three expert microbiologists. These questions were used to converse with ChatGPT by a single user, and the responses were recorded for further analysis. The answers were scored by three microbiologists on a rating scale of 0-5, and the average of the three scores was taken as the final score for analysis. As the data were not normally distributed, we used non-parametric statistical tests. The overall scores were tested by a one-sample median test with hypothetical values of 4 and 5. The scores of answers to first-order and second-order questions were compared by the Mann-Whitney U test. Module-wise responses were tested by the Kruskal-Wallis test, followed by post hoc pairwise comparisons.

Results: The overall score of the 96 answers was 4.04 ± 0.37 (median: 4.17, Q1-Q3: 3.88-4.33), with the mean score of answers to first-order knowledge questions being 4.07 ± 0.32 (median: 4.17, Q1-Q3: 4-4.33) and that of answers to second-order knowledge questions being 3.99 ± 0.43 (median: 4, Q1-Q3: 3.67-4.33) (Mann-Whitney p = 0.4). The score was significantly below 5 (one-sample median test p < 0.0001) and similar to 4 (one-sample median test p = 0.09). Overall, there was variation in the median scores obtained across the eight categories of topics in microbiology, indicating inconsistent performance between topics.

Conclusion: The results indicate that ChatGPT is capable of answering both first- and second-order knowledge questions related to the subject of microbiology. The model achieved an accuracy of approximately 80%, and there was no difference between its capability to answer first-order and second-order knowledge questions. These findings suggest that ChatGPT has the potential to be an effective tool for automated question-answering in the field of microbiology. However, continued improvement in the training and development of language models is necessary to enhance their performance and make them suitable for academic use.
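As a rough illustration of the two non-parametric comparisons named in the methods, the sketch below applies a Mann-Whitney U test between first- and second-order scores and a Kruskal-Wallis test across modules. All score vectors are invented placeholders, not the study's data.

# Minimal sketch (invented data): the two group comparisons described in
# the abstract -- Mann-Whitney U between first- and second-order question
# scores, and Kruskal-Wallis across curriculum modules.
from scipy.stats import kruskal, mannwhitneyu

first_order = [4.17, 4.0, 4.33, 4.17, 3.83, 4.33]
second_order = [4.0, 3.67, 4.33, 3.83, 4.17, 4.0]
u_stat, p_mw = mannwhitneyu(first_order, second_order)
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_mw:.3f}")

# Hypothetical module-wise scores (three of the eight modules shown).
module_a = [4.2, 4.0, 4.3, 4.1]
module_b = [3.9, 4.1, 3.8, 4.0]
module_c = [4.3, 4.4, 4.1, 4.2]
h_stat, p_kw = kruskal(module_a, module_b, module_c)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_kw:.3f}")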

                Author and article information

Journal
Cureus (Palo Alto, CA)
ISSN: 2168-8184
29 July 2024
Volume 16, Issue 7: e65658
                Affiliations
                [1 ] Department of Pediatric Dentistry, School of Dentistry of Ribeirão Preto at University of São Paulo, Ribeirão Preto, BRA
                [2 ] Department of Dental Materials and Prosthesis, School of Dentistry of Ribeirão Preto at University of São Paulo, Ribeirão Preto, BRA
[3 ] Department of Public Health, University of Illinois Chicago College of Dentistry, Chicago, USA
[4 ] Department of Pediatric Dentistry, School of Dentistry of Ribeirão Preto at University of São Paulo, Ribeirão Preto, BRA
[5 ] Department of Dentistry, School of Dentistry of Ribeirão Preto at University of São Paulo, Ribeirão Preto, BRA
[6 ] Department of Restorative Dentistry, University of Illinois Chicago College of Dentistry, Chicago, USA
                [7 ] Department of Pediatric Dentistry, University of Illinois Chicago College of Dentistry, Chicago, USA
Article
DOI: 10.7759/cureus.65658
PMC: 11352766
                Copyright © 2024, Molena et al.

This is an open access article distributed under the terms of the Creative Commons Attribution License CC-BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History: 29 July 2024
                Categories
                Dentistry
                Healthcare Technology

Keywords: decision-making process, evidence-based practice, knowledge acquisition, AI and machine learning, ChatGPT, decision-support tools, artificial intelligence in dentistry
