
      Large language models to identify social determinants of health in electronic health records

      research-article


          Abstract

          Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in the zero- and few-shot settings, except GPT4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs in improving real-world evidence on SDoH and assisting in identifying patients who could benefit from resource support.
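
The abstract reports macro-F1 and notes that class imbalance is a challenge. As a rough illustration of why macro-F1 suits imbalanced labels, the sketch below computes it as the unweighted mean of per-class F1 scores. It is a simplified single-label version written for this note, not the authors' evaluation code; the function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(gold, pred, labels=None):
    """Return the unweighted mean of per-class F1 scores.

    Simplifying assumption: each item carries exactly one label.
    A class with no true positives contributes an F1 of 0.
    """
    if labels is None:
        labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p where it was wrong
            fn[g] += 1  # missed the true class g
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: most sentences mention no SDoH at all.
gold = ["none", "none", "none", "none", "housing", "employment"]
pred = ["none", "none", "none", "none", "none", "employment"]

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
score = macro_f1(gold, pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case accuracy stays high because the majority class dominates, while macro-F1 drops because the missed "housing" mention counts as much as the frequent "none" class.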


                Author and article information

                Contributors
                Danielle_Bitterman@dfci.harvard.edu
                Journal
                NPJ Digit Med
                NPJ Digital Medicine
                Nature Publishing Group UK (London)
                2398-6352
                11 January 2024
                2024
                Volume: 7
                Article number: 6
                Affiliations
                [1] Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA (GRID grid.38142.3c; ISNI 000000041936754X)
                [2] Department of Radiation Oncology, Brigham and Women’s Hospital/Dana-Farber Cancer Institute, Boston, MA, USA (https://ror.org/04b6nzv94)
                [3] Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA (GRID grid.38142.3c; ISNI 000000041936754X)
                [4] Adult Resource Office, Dana-Farber Cancer Institute, Boston, MA, USA (https://ror.org/02jzgtq86)
                [5] Radiology and Nuclear Medicine, GROW & CARIM, Maastricht University, Maastricht, The Netherlands (https://ror.org/02jz4aj89)
                [6] Department of Data Science, Dana-Farber Cancer Institute and Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA (https://ror.org/02jzgtq86)
                Author information
                http://orcid.org/0000-0001-7999-7410
                http://orcid.org/0000-0002-4313-2754
                http://orcid.org/0000-0002-2122-2003
                Article
                970
                DOI: 10.1038/s41746-023-00970-0
                PMCID: PMC10781957
                PMID: 38200151
                7298ce02-5fa4-4c1e-8d10-28fe3d299448
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 14 August 2023
                Accepted: 15 November 2023
                Funding
                Funded by: FundRef https://doi.org/10.13039/100000054, U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI);
                Award ID: U54CA274516-01A1
                Award ID: R35CA22052
                Funded by: FundRef https://doi.org/10.13039/100006098, Radiological Society of North America (RSNA);
                Funded by: FundRef https://doi.org/10.13039/100000982, Conquer Cancer Foundation (Conquer Cancer Foundation of the American Society of Clinical Oncology);
                Funded by: FundRef https://doi.org/10.13039/100000092, U.S. Department of Health & Human Services | NIH | U.S. National Library of Medicine (NLM);
                Award ID: R01LM013486
                Categories
                Article
                Custom metadata
                © Springer Nature Limited 2024

                health care, machine learning
