
      Claims‐Based Algorithms for Identifying Patients With Pulmonary Hypertension: A Comparison of Decision Rules and Machine‐Learning Approaches


          Abstract

          Background

          Real‐world healthcare data are an important resource for epidemiologic research. However, accurate identification of patient cohorts—a crucial first step underpinning the validity of research results—remains a challenge. We developed and evaluated claims‐based case ascertainment algorithms for pulmonary hypertension (PH), comparing conventional decision rules with state‐of‐the‐art machine‐learning approaches.

          Methods and Results

          We analyzed an electronic health record‐Medicare linked database from two large academic tertiary care hospitals (years 2007–2013). Electronic health record charts were reviewed to form a gold standard cohort of patients with (n=386) and without PH (n=164). Using health encounter data captured in Medicare claims (including patients’ demographics, diagnoses, medications, and procedures), we developed and compared 2 approaches for identifying patients with PH: decision rules and machine‐learning algorithms using penalized lasso regression, random forest, and gradient boosting machine. The best‐performing rule‐based algorithm (having ≥3 PH‐related healthcare encounters and having undergone right heart catheterization) attained an area under the receiver operating characteristic curve of 0.64 (sensitivity, 0.75; specificity, 0.48). All 3 machine‐learning algorithms outperformed the best‐performing rule‐based algorithm (P<0.001). A model derived from the random forest algorithm achieved an area under the receiver operating characteristic curve of 0.88 (sensitivity, 0.87; specificity, 0.70), and gradient boosting machine achieved comparable results (area under the receiver operating characteristic curve, 0.85; sensitivity, 0.87; specificity, 0.70). Penalized lasso regression achieved an area under the receiver operating characteristic curve of 0.73 (sensitivity, 0.70; specificity, 0.68).
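
As an illustration of the metrics reported above, the following minimal Python sketch applies a decision rule of the kind described (≥3 PH-related encounters plus right heart catheterization) to a small cohort and computes sensitivity, specificity, and the area under the ROC curve. It is not the authors' code: the `Patient` fields, the helper names, and the toy cohort are all hypothetical and invented for illustration, and the numbers it produces are unrelated to the study results.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical fields; the source does not name its claims variables.
    ph_encounters: int   # number of PH-related healthcare encounters
    had_rhc: bool        # a right heart catheterization claim is present
    has_ph: bool         # gold-standard label from chart review


def rule_positive(p: Patient) -> bool:
    """Rule of the kind described above: >=3 PH encounters AND RHC."""
    return p.ph_encounters >= 3 and p.had_rhc


def sensitivity_specificity(patients, classify):
    """Return (sensitivity, specificity) of `classify` against has_ph."""
    tp = sum(1 for p in patients if p.has_ph and classify(p))
    fn = sum(1 for p in patients if p.has_ph and not classify(p))
    tn = sum(1 for p in patients if not p.has_ph and not classify(p))
    fp = sum(1 for p in patients if not p.has_ph and classify(p))
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random
    non-case, counting ties as one half (the Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy cohort, invented for illustration only (not study data).
cohort = [
    Patient(5, True, True),
    Patient(3, True, True),
    Patient(4, False, True),
    Patient(1, False, True),
    Patient(0, False, False),
    Patient(2, False, False),
    Patient(3, True, False),
    Patient(1, True, False),
]

sens, spec = sensitivity_specificity(cohort, rule_positive)
rule_auc = auc([float(rule_positive(p)) for p in cohort],
               [p.has_ph for p in cohort])
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={rule_auc:.3f}")
```

The same `auc` helper accepts continuous scores, such as predicted probabilities from a random forest or gradient boosting model, which is what allows a rule and a model to be compared on a common scale.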

          Conclusions

          Research‐grade case identification algorithms for PH can be derived and rigorously validated using machine‐learning algorithms. Simple decision rules commonly applied in published literature performed poorly; more complex rule‐based algorithms may address the limitations of this approach. PH research using claims data would be considerably strengthened through the use of validated algorithms for cohort ascertainment.


                Author and article information

                Contributors
                mei-sing_ong@hms.harvard.edu
                Journal
                J Am Heart Assoc
                Journal of the American Heart Association: Cardiovascular and Cerebrovascular Disease
                John Wiley and Sons Inc. (Hoboken)
                2047-9980
                29 September 2020
                06 October 2020
                Volume: 9
                Issue: 19 (doiID: 10.1002/jah3.v9.19)
                Article number: e016648
                Affiliations
                [1] Department of Population Medicine, Harvard Medical School & Harvard Pilgrim Health Care Institute, Boston, MA
                [2] Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA
                [3] Laboratory of Computer Science, Massachusetts General Hospital, Harvard Medical School, Boston, MA
                [4] Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
                [5] Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
                [6] Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
                [7] Department of Pediatrics, Harvard Medical School, Boston, MA
                [8] Department of Biomedical Informatics, Harvard Medical School, Boston, MA
                Author notes
                [*] Correspondence to: Mei‐Sing Ong, PhD, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 401 Park Drive, Suite 401E, Boston, MA 02115, USA. E‐mail: mei-sing_ong@hms.harvard.edu

                Author information
                https://orcid.org/0000-0002-1241-1013
                https://orcid.org/0000-0003-2043-1601
                https://orcid.org/0000-0002-6784-764X
                https://orcid.org/0000-0003-0748-2674
                Article
                JAH35505
                10.1161/JAHA.120.016648
                7792386
                32990147
                9869b7a8-a033-4acc-a3f5-156531a943ba
                © 2020 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

                History
                : 17 March 2020
                : 05 August 2020
                Page count
                Figures: 1, Tables: 6, Pages: 10, Words: 6324
                Funding
                Funded by: National Heart, Lung, and Blood Institute, open-funder-registry 10.13039/100000050
                Funded by: National Institutes of Health, open-funder-registry 10.13039/100000002
                Award ID: U01HL121518
                Funded by: National Center for Advancing Translational Sciences (NCATS)
                Award ID: U01TR002623
                Funded by: Patient Centered Outcomes Research Institute, open-funder-registry 10.13039/100006093
                Award ID: CDRN‐1306‐04864
                Categories
                Original Research
                Epidemiology

                Cardiovascular Medicine
                computable phenotype,machine learning,pulmonary hypertension
