Machine Learning Approaches to Predict 6-Month Mortality Among Patients With Cancer

Parikh, Ravi B; Manz, Christopher; Chivers, Corey; Regli, Susan Harkness; Braun, Jennifer; Draugelis, Michael E; Schuchter, Lynn M; Shulman, Lawrence N.; Navathe, Amol S; Patel, Mitesh S; O’Connor, Nina R.

doi:10.1001/jamanetworkopen.2019.15997


research-article

Author(s): Ravi B. Parikh, MD, MPP ¹,²,³,⁴,⁵; Christopher Manz, MD ¹,²,³; Corey Chivers, PhD ⁶; Susan Harkness Regli, PhD ⁶; Jennifer Braun, MHA ²; Michael E. Draugelis, MS ⁶; Lynn M. Schuchter, MD ¹,²,³; Lawrence N. Shulman, MD ¹,²,³; Amol S. Navathe, MD, PhD ¹,⁴,⁵; Mitesh S. Patel, MD, MBA ¹,⁵; Nina R. O’Connor, MD ¹,²

Publication date (Electronic): 25 October 2019

Journal: JAMA Network Open

Publisher: American Medical Association


Key Points

Question

Can machine learning algorithms identify oncology patients at risk of short-term mortality to inform timely conversations between patients and physicians regarding serious illness?

Findings

In this cohort study of 26 525 patients seen in oncology practices within a large academic health system, machine learning algorithms accurately identified patients at high risk of 6-month mortality with good discrimination and positive predictive value. When the gradient boosting algorithm was applied in real time, most patients who were classified as having high risk were deemed appropriate by oncology clinicians for a conversation regarding serious illness.

Meaning

In this study, machine learning algorithms accurately identified patients with cancer who were at risk of 6-month mortality, suggesting that these models could facilitate more timely conversations between patients and physicians regarding goals and values.

Abstract

This cohort study develops, validates, and compares machine learning algorithms that use structured electronic health record data before a clinic visit to predict mortality among patients with cancer.

Importance

Machine learning algorithms could identify patients with cancer who are at risk of short-term mortality. However, it is unclear how different machine learning algorithms compare and whether they could prompt clinicians to have timely conversations about treatment and end-of-life preferences.

Objectives

To develop, validate, and compare machine learning algorithms that use structured electronic health record data before a clinic visit to predict mortality among patients with cancer.

Design, Setting, and Participants

Cohort study of 26 525 adult patients who had outpatient oncology or hematology/oncology encounters at a large academic cancer center and 10 affiliated community practices between February 1, 2016, and July 1, 2016. Patients were not required to receive cancer-directed treatment. Patients were observed for up to 500 days after the encounter. Data analysis took place between October 1, 2018, and September 1, 2019.
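A patient can have several encounters in the study window, and the Results describe a patient-level partition with one randomly selected encounter per patient. The following is a minimal sketch of one way such a partition could be built; the function name `split_by_patient`, the tuple layout, the seed, and the toy data are illustrative assumptions, not the authors' code.

```python
import random
from collections import defaultdict


def split_by_patient(encounters, train_fraction=0.7, seed=0):
    """Split at the patient level, keeping one random encounter per patient.

    encounters: list of (patient_id, encounter_id) tuples (hypothetical layout).
    Returns (train, validation), each a list of (patient_id, encounter_id).
    """
    rng = random.Random(seed)

    # Group every encounter under its patient.
    by_patient = defaultdict(list)
    for patient_id, encounter_id in encounters:
        by_patient[patient_id].append(encounter_id)

    # Shuffle patients, not encounters, so no patient lands in both cohorts.
    patient_ids = sorted(by_patient)
    rng.shuffle(patient_ids)
    n_train = round(train_fraction * len(patient_ids))
    train_ids = set(patient_ids[:n_train])

    train, validation = [], []
    for patient_id in patient_ids:
        # One randomly selected index encounter per patient.
        index_encounter = rng.choice(by_patient[patient_id])
        target = train if patient_id in train_ids else validation
        target.append((patient_id, index_encounter))
    return train, validation


# Toy data: 10 patients with 1 to 3 encounters each.
encounters = [(p, f"enc-{p}-{k}") for p in range(10) for k in range(1 + p % 3)]
train, validation = split_by_patient(encounters)
```

Splitting on patients rather than encounters keeps repeated visits by the same person from appearing on both sides of the partition, which would otherwise inflate validation performance.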

Exposures

Logistic regression, gradient boosting, and random forest algorithms.

Main Outcomes and Measures

Primary outcome was 180-day mortality from the index encounter; secondary outcome was 500-day mortality from the index encounter.
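As an illustration of how the two outcomes could be derived from dates, the sketch below labels a patient by whether a recorded death falls within a given horizon of the index encounter. The function name `died_within`, the inclusive boundary, and the treatment of a missing death date as "alive" are assumptions made for this example; the abstract does not specify these details.

```python
from datetime import date, timedelta
from typing import Optional


def died_within(index_date: date, death_date: Optional[date],
                horizon_days: int = 180) -> bool:
    """Return True if death occurred within horizon_days of the index encounter.

    Assumptions: no recorded death date means the patient is treated as alive,
    and a death exactly horizon_days after the index encounter counts.
    """
    if death_date is None:
        return False
    return index_date <= death_date <= index_date + timedelta(days=horizon_days)


index = date(2016, 3, 1)
primary = died_within(index, date(2017, 3, 1))                       # 180-day outcome
secondary = died_within(index, date(2017, 3, 1), horizon_days=500)   # 500-day outcome
```

In this toy case the death occurs 365 days after the index encounter, so it counts toward the secondary outcome but not the primary one.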

Results

Among 26 525 patients in the analysis, 1065 (4.0%) died within 180 days of the index encounter. Among those who died, the mean age was 67.3 (95% CI, 66.5-68.0) years, and 500 (47.0%) were women. Among those who were alive at 180 days, the mean age was 61.3 (95% CI, 61.1-61.5) years, and 15 922 (62.5%) were women. The population was randomly partitioned into training (18 567 [70.0%]) and validation (7958 [30.0%]) cohorts at the patient level, and a randomly selected encounter was included in either the training or validation set. At a prespecified alert rate of 0.02, positive predictive values were higher for the random forest (51.3%) and gradient boosting (49.4%) algorithms compared with the logistic regression algorithm (44.7%). There was no significant difference in discrimination among the random forest (area under the receiver operating characteristic curve [AUC], 0.88; 95% CI, 0.86-0.89), gradient boosting (AUC, 0.87; 95% CI, 0.85-0.89), and logistic regression (AUC, 0.86; 95% CI, 0.84-0.88) models ( P for comparison = .02). In the random forest model, observed 180-day mortality was 51.3% (95% CI, 43.6%-58.8%) in the high-risk group vs 3.4% (95% CI, 3.0%-3.8%) in the low-risk group; at 500 days, observed mortality was 64.4% (95% CI, 56.7%-71.4%) in the high-risk group and 7.6% (7.0%-8.2%) in the low-risk group. In a survey of 15 oncology clinicians with a 52.1% response rate, 100 of 171 patients (58.8%) who had been flagged as having high risk by the gradient boosting algorithm were deemed appropriate for a conversation about treatment and end-of-life preferences in the upcoming week.
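The two headline metrics above can be illustrated with a small sketch: positive predictive value among the top-scoring fraction of patients (the alert rate) and the AUC computed as the probability that a randomly chosen patient who died scores higher than one who survived. The function names, the rounding rule for the number of alerts, and the toy scores are assumptions for illustration; this is not the authors' evaluation code.

```python
def ppv_at_alert_rate(scores, labels, alert_rate=0.02):
    """PPV among the highest-scoring alert_rate fraction of patients.

    Assumption: the number of alerts is round(alert_rate * n), at least 1.
    """
    n_alerts = max(1, round(alert_rate * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    flagged = ranked[:n_alerts]
    return sum(label for _, label in flagged) / n_alerts


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    Quadratic in the number of patients, so suitable only for small examples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 100 patients with distinct risk scores; 3 of them died.
scores = [i / 100 for i in range(100)]
labels = [1 if i in (99, 97, 60) else 0 for i in range(100)]

toy_ppv = ppv_at_alert_rate(scores, labels, alert_rate=0.02)
toy_auc = auc(scores, labels)
```

Fixing the alert rate compares the models at the same clinician workload: each flags the same number of patients, so a higher PPV means more of those flagged patients died within the outcome window.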

Conclusions and Relevance

In this cohort study, machine learning algorithms based on structured electronic health record data accurately identified patients with cancer at risk of short-term mortality. When the gradient boosting algorithm was applied in real time, clinicians believed that most patients who had been identified as having high risk were appropriate for a timely conversation about treatment and end-of-life preferences.


Author and article information

Journal

Journal ID (nlm-ta): JAMA Netw Open

Journal ID (iso-abbrev): JAMA Netw Open

Journal ID (pmc): JAMA Netw Open

Title: JAMA Network Open

Publisher: American Medical Association

ISSN (Electronic): 2574-3805

Publication date (Electronic): 25 October 2019

Publication date Collection: October 2019

Publication date PMC-release: 25 October 2019

Volume: 2

Issue: 10

Electronic Location Identifier: e1915997

Affiliations

[1 ]Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia

[2 ]Abramson Cancer Center, University of Pennsylvania, Philadelphia

[3 ]Penn Center for Cancer Care Innovation, University of Pennsylvania, Philadelphia

[4 ]Department of Medical Ethics and Health Policy, University of Pennsylvania, Philadelphia

[5 ]Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania

[6 ]Penn Medicine, University of Pennsylvania, Philadelphia

Author notes

Article Information

Accepted for Publication: October 4, 2019.

Published: October 25, 2019. doi:10.1001/jamanetworkopen.2019.15997

Corresponding Author: Ravi B. Parikh, MD, MPP, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Dr, Blockley 1102, Philadelphia, PA 19104 (ravi.parikh@pennmedicine.upenn.edu).

Author Contributions: Drs Parikh and Chivers had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Drs Parikh and Manz contributed equally to this work.

Concept and design: Parikh, Manz, Chivers, Regli, Draugelis, Shulman, O’Connor.

Acquisition, analysis, or interpretation of data: Parikh, Manz, Chivers, Braun, Schuchter, Shulman, Navathe, Patel, O’Connor.

Drafting of the manuscript: Parikh, Manz, Chivers.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Parikh, Manz, Chivers.

Obtained funding: Parikh.

Administrative, technical, or material support: Parikh, Manz, Chivers, Regli, Draugelis, Shulman, Patel, O’Connor.

Supervision: Parikh, Regli, Braun, Schuchter, Shulman, Navathe, O’Connor.

Conflict of Interest Disclosures: Dr Parikh reported receiving personal fees from GNS Healthcare; grants from Conquer Cancer Foundation, the Veterans Affairs Center for Health Equity Research and Promotion, and the Penn Center for Precision Medicine; and support from the Medical University of South Carolina Transdisciplinary Collaborative Center in Precision Medicine and Minority Men’s Health outside the submitted work. Dr Navathe reported receiving grants from Hawaii Medical Services Association, Anthem Public Policy Institute, the Commonwealth Fund, Oscar Health, Cigna Corporation, the Robert Wood Johnson Foundation, and the Donaghue Foundation; serving as an advisor for Navvis Healthcare and Agathos Inc; serving as an advisor and receiving travel compensation from University Health System (Singapore); receiving an honorarium from Elsevier Press; receiving personal fees from Navahealth; receiving speaker fees and travel from the Cleveland Clinic; and serving as an uncompensated board member for Integrated Services, Inc outside the submitted work. Dr Patel reported being the owner of Catalyst Health LLC, a consulting firm; having stock options from and serving on the advisory board of LifeVest Health; having stock options from, serving on the advisory board of, and receiving personal fees from HealthMine Services; and receiving personal fees from and serving on the advisory board of Holistic Industries outside the submitted work. No other disclosures were reported.

Funding/Support: This work was supported by grant 5-T32-CA009615 to Dr Parikh from the National Institutes of Health and grant T32-GM075766-14 to Dr Manz from the National Institutes of Health. Drs Parikh and Manz were supported by the Penn Center for Precision Medicine.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Meeting Presentation: This article was presented at the Supportive Care in Oncology Symposium of the American Society of Clinical Oncology; October 25, 2019; San Francisco, California.

Article

Publisher ID: zoi190606

DOI: 10.1001/jamanetworkopen.2019.15997

PMC ID: 6822091

PubMed ID: 31651973

SO-VID: 033463a1-ead9-46a7-8a08-0e54791f2aab

License:

This is an open access article distributed under the terms of the CC-BY License.

History

Date received : 29 May 2019

Date accepted : 4 October 2019

Funding

Funded by: National Institutes of Health

Funded by: Penn Center for Precision Medicine
