Development and Validation of a Risk Stratification Model Using Disease Severity Hierarchy for Mortality or Major Cardiovascular Event

Key Points

Question

Does incorporating clinical domain knowledge regarding diseases, disease severity, and treatment pathways into machine learning improve risk stratification?

Findings

In this retrospective cohort study involving 51 969 patients, a new representation of patient data was developed and used to train machine learning models to predict mortality and major cardiovascular events. Results showed substantial improvement in prediction performance compared with traditional patient data representation methods.

Meaning

The findings of this study suggest that methods that can extract and represent the clinical knowledge contained in electronic medical records should be incorporated into machine learning models for use in clinical decision support systems.

Abstract

This cohort study introduces a new representation of patient data called disease severity hierarchy that leverages domain knowledge in a nested fashion to create subpopulations that share increasing amounts of clinical details suitable for risk prediction.

Importance

Clinical domain knowledge about diseases and their comorbidities, severity, treatment pathways, and outcomes can facilitate diagnosis, enhance preventive strategies, and help create smart evidence-based practice guidelines.

Objective

To introduce a new representation of patient data called disease severity hierarchy that leverages domain knowledge in a nested fashion to create subpopulations that share increasing amounts of clinical details suitable for risk prediction.
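The nesting idea in this objective can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' algorithm: the level names in `LEVELS`, the record fields, and the function `nested_subpopulations` are all hypothetical and are not taken from the article. Each patient is placed in one group per level of detail, so deeper groups contain patients who share more clinical details.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical ordering of clinical detail, coarse to fine.
# These level names are illustrative assumptions, not the article's hierarchy.
LEVELS = ("diagnosis", "treatment", "control_status")


def nested_subpopulations(
    patients: Dict[str, Dict[str, str]]
) -> Dict[Tuple[str, ...], List[str]]:
    """Group patient IDs by every prefix of their (diagnosis, treatment, ...) path."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for patient_id, record in patients.items():
        path: Tuple[str, ...] = ()
        for level in LEVELS:
            if level not in record:
                break  # no finer detail recorded for this patient
            path = path + (record[level],)
            groups[path].append(patient_id)
    return dict(groups)


# Toy, made-up records (not study data).
toy_patients = {
    "p1": {"diagnosis": "hypertension", "treatment": "on_medication",
           "control_status": "uncontrolled"},
    "p2": {"diagnosis": "hypertension", "treatment": "on_medication",
           "control_status": "controlled"},
    "p3": {"diagnosis": "hypertension", "treatment": "untreated"},
}

for path, members in nested_subpopulations(toy_patients).items():
    print(" > ".join(path), members)
```

In this toy layout, membership in each nested group could serve as a binary feature for a downstream risk model; whether the article encodes its hierarchy this way is not stated in the abstract.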

Design, Setting, and Participants

This retrospective cohort study included 51 969 patients aged 45 to 85 years, with 10 674 patients who received primary care at the Mayo Clinic between January 2004 and December 2015 in the training cohort and 41 295 patients who received primary care at Fairview Health Services from January 2010 to December 2017 in the validation cohort. Data were analyzed from May 2018 to December 2019.

Main Outcomes and Measures

Several binary classification measures, including the area under the receiver operating characteristic curve (AUC), Gini score, sensitivity, and positive predictive value, were used to evaluate models predicting all-cause mortality and major cardiovascular events at ages 60, 65, 75, and 80 years.
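For readers less familiar with these measures, the following minimal sketch computes them on toy data. It assumes the usual definitions: AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, and Gini score as 2 × AUC − 1. The abstract does not give the authors' exact formulas or software, so the function names, the 0.5 threshold, and the data below are illustrative assumptions.

```python
from typing import Sequence, Tuple


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_score(auc: float) -> float:
    """Gini score under the common definition 2 * AUC - 1 (an assumption here)."""
    return 2.0 * auc - 1.0


def sensitivity_and_ppv(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Toy labels and scores (made up, not study data).
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

auc = auc_score(labels, scores)
sens, ppv = sensitivity_and_ppv(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f} Gini={gini_score(auc):.3f} sensitivity={sens:.3f} PPV={ppv:.3f}")
```

The pairwise AUC is quadratic in cohort size and is meant only to make the definition explicit; on a cohort of this study's size a rank-based implementation would be the practical choice.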

Results

In the training cohort, the mean (SD) age was 59.4 (10.8) years, 6324 patients (59.3%) were women, and 9804 (91.9%) were white. In the validation cohort, the mean (SD) age was 57.4 (7.9) years, 21 975 patients (53.1%) were women, and 37 653 (91.2%) were white. During follow-up, 945 patients (8.9%) in the training cohort died and 787 (7.4%) had major cardiovascular events. For predicting death in the training cohort at ages 60, 65, 75, and 80 years, models using the new representation achieved AUCs of 0.96 (95% CI, 0.94-0.97), 0.96 (95% CI, 0.95-0.98), 0.97 (95% CI, 0.96-0.98), and 0.98 (95% CI, 0.98-0.99), respectively, whereas standard methods achieved modest AUCs of 0.67 (95% CI, 0.55-0.80), 0.66 (95% CI, 0.56-0.79), 0.64 (95% CI, 0.57-0.71), and 0.63 (95% CI, 0.54-0.70), respectively.

Conclusions and Relevance

In this study, the proposed patient data representation predicted the age at which a patient was at risk of dying or developing major cardiovascular events substantially more accurately than standard methods. The representation uses known relationships contained in electronic health records to capture disease severity in a natural and clinically meaningful way, and it is both expressive and interpretable. This novel patient representation can support critical decision-making, the development of smart guidelines, and better health care and disease management by helping to identify patients at high risk.

Author and article information

Journal

Journal ID (nlm-ta): JAMA Netw Open

Journal ID (iso-abbrev): JAMA Netw Open

Journal ID (pmc): JAMA Netw Open

Title: JAMA Network Open

Publisher: American Medical Association

ISSN (Electronic): 2574-3805

Publication date (Electronic): 17 July 2020

Publication date Collection: July 2020

Publication date PMC-release: 17 July 2020

Volume: 3

Issue: 7

Electronic Location Identifier: e208270

Affiliations

[1] Division of Digital Health Science, Department of Health Science Research, Mayo Clinic, Rochester, Minnesota

[2] The Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, Minnesota

[3] Division of General Internal Medicine, Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota

[4] Division of Healthcare Policy and Research, Department of Health Science Research, Mayo Clinic, Rochester, Minnesota

[5] University of Minnesota School of Nursing, Minneapolis

[6] Department of Computer Science and Engineering, University of Minnesota, Minneapolis

[7] Institute for Health Informatics, University of Minnesota, Minneapolis

[8] Department of Medicine, University of Minnesota, Minneapolis

Author notes

Article Information

Accepted for Publication: April 13, 2020.

Published: July 17, 2020. doi:10.1001/jamanetworkopen.2020.8270

Corresponding Author: Che Ngufor, PhD, Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN 55905 (ngufor.che@mayo.edu).

Author Contributions: Drs Ngufor and Simon had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Ngufor, Caraballo, Steinbach, Simon.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Ngufor, Caraballo, Simon.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Ngufor, O'Byrne, Chen, Simon.

Obtained funding: Caraballo, Steinbach, Simon.

Administrative, technical, or material support: Ngufor, Caraballo.

Supervision: Caraballo, Steinbach, Simon.

Conflict of Interest Disclosures: Dr Ngufor reported receiving grants and nonfinancial support consisting of office space and supplies, computers, and other administrative services from the Mayo Clinic and the University of Minnesota during the conduct of the study. Dr Shah reported receiving research support through the Mayo Clinic from the Centers of Medicare & Medicaid Innovation, the US Food and Drug Administration, the Agency for Healthcare Research and Quality, the National Heart, Lung, and Blood Institute of the National Institutes of Health, the Medical Devices Innovation Consortium/National Evaluation System for Health Technology, the National Science Foundation, and the Patient-Centered Outcomes Research Institute. No other disclosures were reported.

Funding/Support: This research was supported by grants IIS-1602198 and IIS-1602394 from the National Science Foundation and grant LM11972 from the National Institutes of Health.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Disclaimer: The views expressed in this manuscript are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health.

Article

Publisher ID: zoi200355

DOI: 10.1001/jamanetworkopen.2020.8270

PMC ID: 7368174

PubMed ID: 32678448

SO-VID: ae13408c-556b-424d-bd5c-dd9dd9bd1b61

License:

This is an open access article distributed under the terms of the CC-BY License.

History

Date received: 11 February 2020

Date accepted: 13 April 2020

Funding

Funded by: National Science Foundation

Funded by: National Institutes of Health
