      Machine Learning Approaches to Predict 6-Month Mortality Among Patients With Cancer

          Key Points

          Question

          Can machine learning algorithms identify oncology patients at risk of short-term mortality to inform timely conversations between patients and physicians regarding serious illness?

          Findings

          In this cohort study of 26 525 patients seen in oncology practices within a large academic health system, machine learning algorithms accurately identified patients at high risk of 6-month mortality with good discrimination and positive predictive value. When the gradient boosting algorithm was applied in real time, most patients who were classified as having high risk were deemed appropriate by oncology clinicians for a conversation regarding serious illness.

          Meaning

          In this study, machine learning algorithms accurately identified patients with cancer who were at risk of 6-month mortality, suggesting that these models could facilitate more timely conversations between patients and physicians regarding goals and values.

          Abstract

          This cohort study develops, validates, and compares machine learning algorithms that use structured electronic health record data before a clinic visit to predict mortality among patients with cancer.

          Importance

          Machine learning algorithms could identify patients with cancer who are at risk of short-term mortality. However, it is unclear how different machine learning algorithms compare and whether they could prompt clinicians to have timely conversations about treatment and end-of-life preferences.

          Objectives

          To develop, validate, and compare machine learning algorithms that use structured electronic health record data before a clinic visit to predict mortality among patients with cancer.

          Design, Setting, and Participants

          Cohort study of 26 525 adult patients who had outpatient oncology or hematology/oncology encounters at a large academic cancer center and 10 affiliated community practices between February 1, 2016, and July 1, 2016. Patients were not required to receive cancer-directed treatment. Patients were observed for up to 500 days after the encounter. Data analysis took place between October 1, 2018, and September 1, 2019.

          Exposures

          Logistic regression, gradient boosting, and random forest algorithms.

          Main Outcomes and Measures

          Primary outcome was 180-day mortality from the index encounter; secondary outcome was 500-day mortality from the index encounter.
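The two outcomes above amount to labeling each index encounter by whether death was recorded within a fixed window. The following is a minimal sketch of that labeling, not the authors' code: the function names, the dictionary keys, and the choice to count a death on exactly day 180 (or day 500) as inside the window are all assumptions made for illustration.

```python
from datetime import date
from typing import Dict, Optional


def died_within(index_date: date, death_date: Optional[date], window_days: int) -> bool:
    """Return True if death was recorded within window_days of the index encounter.

    Assumption: the window is inclusive of its last day, and a missing
    death date means the patient was alive through follow-up.
    """
    if death_date is None:
        return False
    elapsed = (death_date - index_date).days
    return 0 <= elapsed <= window_days


def label_outcomes(index_date: date, death_date: Optional[date]) -> Dict[str, bool]:
    """Build the primary (180-day) and secondary (500-day) outcome labels."""
    return {
        "died_180d": died_within(index_date, death_date, 180),
        "died_500d": died_within(index_date, death_date, 500),
    }


if __name__ == "__main__":
    # Hypothetical patient: index encounter on 1 March 2016, death on 15 January 2017.
    print(label_outcomes(date(2016, 3, 1), date(2017, 1, 15)))
```

Because the 500-day window contains the 180-day window, every patient who meets the primary outcome also meets the secondary one.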

          Results

          Among 26 525 patients in the analysis, 1065 (4.0%) died within 180 days of the index encounter. Among those who died, the mean age was 67.3 (95% CI, 66.5-68.0) years, and 500 (47.0%) were women. Among those who were alive at 180 days, the mean age was 61.3 (95% CI, 61.1-61.5) years, and 15 922 (62.5%) were women. The population was randomly partitioned into training (18 567 [70.0%]) and validation (7958 [30.0%]) cohorts at the patient level, and a randomly selected encounter was included in either the training or validation set. At a prespecified alert rate of 0.02, positive predictive values were higher for the random forest (51.3%) and gradient boosting (49.4%) algorithms compared with the logistic regression algorithm (44.7%). There was no significant difference in discrimination among the random forest (area under the receiver operating characteristic curve [AUC], 0.88; 95% CI, 0.86-0.89), gradient boosting (AUC, 0.87; 95% CI, 0.85-0.89), and logistic regression (AUC, 0.86; 95% CI, 0.84-0.88) models (P for comparison = .02). In the random forest model, observed 180-day mortality was 51.3% (95% CI, 43.6%-58.8%) in the high-risk group vs 3.4% (95% CI, 3.0%-3.8%) in the low-risk group; at 500 days, observed mortality was 64.4% (95% CI, 56.7%-71.4%) in the high-risk group and 7.6% (95% CI, 7.0%-8.2%) in the low-risk group. In a survey of 15 oncology clinicians with a 52.1% response rate, 100 of 171 patients (58.8%) who had been flagged as having high risk by the gradient boosting algorithm were deemed appropriate for a conversation about treatment and end-of-life preferences in the upcoming week.
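The evaluation steps named in the Results (a patient-level random partition, positive predictive value at a fixed alert rate, and AUC) can be illustrated with a short sketch. This is not the study's implementation: the function names, the seed, the toy scores, and the rule of flagging the top-scoring fraction of patients are assumptions chosen to keep the example self-contained in standard-library Python.

```python
import random
from typing import Hashable, List, Sequence, Tuple


def split_patients(patient_ids: Sequence[Hashable], train_fraction: float = 0.7,
                   seed: int = 0) -> Tuple[List[Hashable], List[Hashable]]:
    """Randomly partition unique patients so no patient lands in both cohorts."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def ppv_at_alert_rate(scores: Sequence[float], labels: Sequence[int],
                      alert_rate: float) -> float:
    """PPV when the top alert_rate fraction of patients (by score) is flagged.

    Assumption: at least one patient is always flagged.
    """
    n_alerts = max(1, int(len(scores) * alert_rate))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_alerts]) / n_alerts


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy predicted risks and observed 180-day deaths for 10 hypothetical patients.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    train, valid = split_patients(range(10))
    print(len(train), len(valid))
    print(ppv_at_alert_rate(toy_scores, toy_labels, alert_rate=0.2))
    print(auc(toy_scores, toy_labels))
```

Fixing the alert rate rather than a probability threshold keeps the number of flagged patients the same across models, so the positive predictive values can be compared directly.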

          Conclusions and Relevance

          In this cohort study, machine learning algorithms based on structured electronic health record data accurately identified patients with cancer at risk of short-term mortality. When the gradient boosting algorithm was applied in real time, clinicians believed that most patients who had been identified as having high risk were appropriate for a timely conversation about treatment and end-of-life preferences.


                Author and article information

                Journal
                JAMA Netw Open
                JAMA Network Open
                American Medical Association
                2574-3805
                25 October 2019
                Volume: 2
                Issue: 10
                Article number: e1915997
                Affiliations
                [1 ]Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia
                [2 ]Abramson Cancer Center, University of Pennsylvania, Philadelphia
                [3 ]Penn Center for Cancer Care Innovation, University of Pennsylvania, Philadelphia
                [4 ]Department of Medical Ethics and Health Policy, University of Pennsylvania, Philadelphia
                [5 ]Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania
                [6 ]Penn Medicine, University of Pennsylvania, Philadelphia
                Author notes
                Article Information
                Accepted for Publication: October 4, 2019.
                Published: October 25, 2019. doi:10.1001/jamanetworkopen.2019.15997
                Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2019 Parikh RB et al. JAMA Network Open.
                Corresponding Author: Ravi B. Parikh, MD, MPP, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Dr, Blockley 1102, Philadelphia, PA 19104 (ravi.parikh@pennmedicine.upenn.edu).
                Author Contributions: Drs Parikh and Chivers had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Drs Parikh and Manz contributed equally to this work.
                Concept and design: Parikh, Manz, Chivers, Regli, Draugelis, Shulman, O’Connor.
                Acquisition, analysis, or interpretation of data: Parikh, Manz, Chivers, Braun, Schuchter, Shulman, Navathe, Patel, O’Connor.
                Drafting of the manuscript: Parikh, Manz, Chivers.
                Critical revision of the manuscript for important intellectual content: All authors.
                Statistical analysis: Parikh, Manz, Chivers.
                Obtained funding: Parikh.
                Administrative, technical, or material support: Parikh, Manz, Chivers, Regli, Draugelis, Shulman, Patel, O’Connor.
                Supervision: Parikh, Regli, Braun, Schuchter, Shulman, Navathe, O’Connor.
                Conflict of Interest Disclosures: Dr Parikh reported receiving personal fees from GNS Healthcare; grants from Conquer Cancer Foundation, the Veterans Affairs Center for Health Equity Research and Promotion, and the Penn Center for Precision Medicine; and support from the Medical University of South Carolina Transdisciplinary Collaborative Center in Precision Medicine and Minority Men’s Health outside the submitted work. Dr Navathe reported receiving grants from Hawaii Medical Services Association, Anthem Public Policy Institute, the Commonwealth Fund, Oscar Health, Cigna Corporation, the Robert Wood Johnson Foundation, and the Donaghue Foundation; serving as an advisor for Navvis Healthcare and Agathos Inc; serving as an advisor and receiving travel compensation from University Health System (Singapore); receiving an honorarium from Elsevier Press; receiving personal fees from Navahealth; receiving speaker fees and travel from the Cleveland Clinic; and serving as an uncompensated board member for Integrated Services, Inc outside the submitted work. Dr Patel reported being the owner of Catalyst Health LLC, a consulting firm; having stock options from and serving on the advisory board of LifeVest Health; having stock options from, serving on the advisory board of, and receiving personal fees from HealthMine Services; and receiving personal fees from and serving on the advisory board of Holistic Industries outside the submitted work. No other disclosures were reported.
                Funding/Support: This work was supported by grant 5-T32-CA009615 to Dr Parikh from the National Institutes of Health and grant T32-GM075766-14 to Dr Manz from the National Institutes of Health. Drs Parikh and Manz were supported by the Penn Center for Precision Medicine.
                Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
                Meeting Presentation: This article was presented at the Supportive Care in Oncology Symposium of the American Society of Clinical Oncology; October 25, 2019; San Francisco, California.
                Article
                Publisher ID: zoi190606
                DOI: 10.1001/jamanetworkopen.2019.15997
                PMCID: PMC6822091
                PMID: 31651973
                033463a1-ead9-46a7-8a08-0e54791f2aab

                History
                Received: 29 May 2019
                Accepted: 4 October 2019
                Funding
                Funded by: National Institutes of Health
                Funded by: Penn Center for Precision Medicine
                Categories
                Research
                Original Investigation
                Online Only
                Oncology
