      IPDfromKM: reconstruct individual patient data from published Kaplan-Meier survival curves

      research-article


          Abstract

          Background

          When performing secondary analysis of published survival data, it is critical to obtain each patient’s raw data, because the individual patient data (IPD) approach is considered the gold standard of data analysis. However, researchers often lack access to IPD. We aim to propose a straightforward and robust approach to obtain IPD from published survival curves with a user-friendly software platform.

          Results

          Improving upon existing methods, we propose an easy-to-use, two-stage approach to reconstruct IPD from published Kaplan-Meier (K-M) curves. Stage 1 extracts raw data coordinates and Stage 2 reconstructs IPD using the proposed method. To facilitate the use of the proposed method, we developed the R package IPDfromKM and an accompanying web-based Shiny application. Both the R package and Shiny application have an “all-in-one” feature such that users can use them to extract raw data coordinates from published K-M curves, reconstruct IPD from the extracted data coordinates, visualize the reconstructed IPD, assess the accuracy of the reconstruction, and perform secondary analysis on the basis of the reconstructed IPD. We illustrate the use of the R package and the Shiny application with K-M curves from published studies. Extensive simulations and real-world data applications demonstrate that the proposed method has high accuracy and great reliability in estimating the number of events, number of patients at risk, survival probabilities, median survival times, and hazard ratios.
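As a rough illustration of the idea behind Stage 2, and not the IPDfromKM algorithm itself, the Kaplan-Meier product-limit formula S(t_i) = S(t_{i-1}) * (1 - d_i / n_i) can be inverted to estimate the number of events d_i at each time point from the survival probabilities read off a curve. The Python sketch below makes two strong simplifying assumptions: there is no censoring, and the total number of patients is known. The function names `km_curve` and `reconstruct_event_times` are hypothetical and exist only for this example.

```python
from collections import Counter


def km_curve(event_times):
    """Kaplan-Meier estimate for fully observed (uncensored) event times.

    Returns a list of (time, survival_probability) pairs, one per
    distinct event time.
    """
    counts = Counter(event_times)
    n_at_risk = len(event_times)
    surv = 1.0
    curve = []
    for t in sorted(counts):
        d = counts[t]                  # events at time t
        surv *= 1.0 - d / n_at_risk    # product-limit update
        curve.append((t, surv))
        n_at_risk -= d
    return curve


def reconstruct_event_times(times, survival, n_total):
    """Invert the product-limit formula: d_i = n_i * (1 - S_i / S_{i-1}).

    Assumes no censoring, so the number at risk only drops through events.
    """
    n_at_risk = n_total
    prev_surv = 1.0
    rebuilt = []
    for t, s in zip(times, survival):
        if n_at_risk == 0:
            break
        d = round(n_at_risk * (1.0 - s / prev_surv))
        d = max(0, min(d, n_at_risk))  # keep the count in a valid range
        rebuilt.extend([t] * d)        # one record per reconstructed event
        n_at_risk -= d
        prev_surv = s
    return rebuilt


# Round trip on a small hypothetical data set.
original = [2, 3, 3, 5, 8, 8, 8, 13]
curve = km_curve(original)
times = [t for t, _ in curve]
survival = [s for _, s in curve]
rebuilt = reconstruct_event_times(times, survival, len(original))
print(rebuilt == original)
```

With real published curves the coordinates are digitized with error and patients are censored, so a practical reconstruction also has to allocate censoring times between the reported numbers at risk; that is the part this sketch leaves out.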

          Conclusions

          IPDfromKM reconstructs IPD flexibly and accurately from published K-M curves of different shapes. We believe that the R package and the Shiny application will greatly facilitate the use of quality IPD and advance the use of secondary data to support informed decision making in medical research.


                Author and article information

                Contributors
                nliu1@mdanderson.org
                yzhou13@mdanderson.org
                jjlee@mdanderson.org
                Journal
                BMC Med Res Methodol
                BMC Medical Research Methodology
                BioMed Central (London)
                1471-2288
                1 June 2021
                2021
                Volume: 21
                Article number: 111
                Affiliations
                GRID grid.240145.6, ISNI 0000 0001 2291 4776, Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, United States
                Author information
                http://orcid.org/0000-0001-5469-9214
                Article
                1308
                10.1186/s12874-021-01308-8
                8168323
                28f8eb1a-b87a-4a7d-a554-767cb7a93f98
                © The Author(s) 2021

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 17 November 2020
                Accepted: 6 May 2021
                Funding
                Funded by: National Cancer Institute
                Award ID: CA016672
                Funded by: National Cancer Institute
                Award ID: CA221703
                Funded by: FundRef http://dx.doi.org/10.13039/100004917, Cancer Prevention and Research Institute of Texas;
                Award ID: RP150519
                Funded by: FundRef http://dx.doi.org/10.13039/100004917, Cancer Prevention and Research Institute of Texas;
                Award ID: RP160668
                Funded by: FundRef http://dx.doi.org/10.13039/100007313, University of Texas MD Anderson Cancer Center;
                Award ID: Oropharynx Cancer Program
                Categories
                Software

                Medicine
                individual patient data (ipd),kaplan-meier curve,meta-analysis,r package,shiny application,survival analysis
