
      Comparing different machine learning techniques for predicting COVID-19 severity

      research-article
      1 , 1 , 2 , 3 , 1 , , 4 , , the National Traditional Chinese Medicine Medical Team
      Infectious Diseases of Poverty
      BioMed Central
      COVID-19, Severity, Machine learning, Support vector machine, Random Forest, Logistic regression


          Abstract

          Background

          Coronavirus disease 2019 (COVID-19) is still spreading globally. Machine learning techniques have been used in disease diagnosis and in predicting treatment outcomes, where they have shown favorable performance. The present study aims to predict COVID-19 severity at admission using different machine learning techniques, including random forest (RF), support vector machine (SVM), and logistic regression (LR). The importance of individual features to COVID-19 severity was further identified.

          Methods

          A retrospective design was adopted at JinYinTan Hospital from January 26 to March 28, 2020. Eighty-six demographic, clinical, and laboratory features were screened using the LassoCV method, Spearman’s rank correlation, experts’ opinions, and literature evaluation. RF, SVM, and LR were used to predict severe COVID-19, and the performance of the models was compared by the area under the curve (AUC). Additionally, feature importance for COVID-19 severity was analyzed using the best-performing model.
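The AUC comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it computes AUC as the fraction of (severe, non-severe) pairs that a model ranks correctly (the Mann-Whitney formulation), and the labels, scores, and the `auc_score` helper are all invented for the example. The encoding 1 = severe is also an assumption.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pair-counting formulation.

    labels: sequence of 0/1 values (assumed encoding: 1 = severe).
    scores: model outputs; a higher score means "more likely severe".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A pair counts 1 if the positive case scores higher, 0.5 on a tie, else 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
model_scores = {
    "RF": [0.9, 0.8, 0.7, 0.2, 0.3, 0.1, 0.4, 0.6],
    "SVM": [0.9, 0.8, 0.35, 0.2, 0.3, 0.1, 0.4, 0.6],
    "LR": [0.9, 0.3, 0.35, 0.2, 0.5, 0.1, 0.4, 0.6],
}

# Score each model on the same cases and pick the one with the highest AUC.
aucs = {name: auc_score(y_true, s) for name, s in model_scores.items()}
best_model = max(aucs, key=aucs.get)

for name, value in aucs.items():
    print(f"{name}: AUC = {value:.3f}")
print("best model by AUC:", best_model)
```

Pair counting is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients. A rank-based implementation would be preferable for larger data.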

          Results

          A total of 287 patients were enrolled, of whom 36.6% were severe cases and 63.4% were non-severe cases. The median age was 60.0 years (interquartile range: 49.0–68.0 years). Three models were established using 23 features: 1 clinical feature, 1 chest computed tomography (CT) feature, and 21 laboratory features. Among the three models, RF yielded the best overall performance, with the highest AUC (0.970, versus 0.948 for SVM and 0.928 for LR). RF achieved a sensitivity of 96.7%, a specificity of 69.5%, and an accuracy of 84.5%. SVM had a sensitivity of 93.9%, a specificity of 79.0%, and an accuracy of 88.5%. LR achieved a sensitivity of 92.3%, a specificity of 72.3%, and an accuracy of 85.2%. Additionally, chest CT had the highest importance for illness severity, followed by neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and D-dimer, in that order.
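For readers less familiar with the three threshold-based metrics reported above, the following sketch shows how they are derived from confusion-matrix counts. It assumes that "severe" is treated as the positive class, which the abstract does not state explicitly. The counts and the `classification_metrics` helper are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: severe cases predicted severe        fn: severe cases predicted non-severe
    tn: non-severe cases predicted non-severe fp: non-severe cases predicted severe
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one severe and one non-severe case")
    sensitivity = tp / (tp + fn)              # share of severe cases detected
    specificity = tn / (tn + fp)              # share of non-severe cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen for easy arithmetic (not study data).
sens, spec, acc = classification_metrics(tp=45, fn=5, tn=40, fp=10)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Unlike AUC, all three values depend on the decision threshold applied to the model's scores, so a model can lead on AUC while another shows higher accuracy at its chosen threshold.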

          Conclusions

          Our results indicated that RF could be a useful predictive tool to identify patients with severe COVID-19, which may facilitate effective care and further optimize resources.

          Graphical Abstract

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s40249-022-00946-4.


          Most cited references (41)


          Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study

          Background: Since December, 2019, Wuhan, China, has experienced an outbreak of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Epidemiological and clinical characteristics of patients with COVID-19 have been reported but risk factors for mortality and a detailed clinical course of illness, including viral shedding, have not been well described.
          Methods: In this retrospective, multicentre cohort study, we included all adult inpatients (≥18 years old) with laboratory-confirmed COVID-19 from Jinyintan Hospital and Wuhan Pulmonary Hospital (Wuhan, China) who had been discharged or had died by Jan 31, 2020. Demographic, clinical, treatment, and laboratory data, including serial samples for viral RNA detection, were extracted from electronic medical records and compared between survivors and non-survivors. We used univariable and multivariable logistic regression methods to explore the risk factors associated with in-hospital death.
          Findings: 191 patients (135 from Jinyintan Hospital and 56 from Wuhan Pulmonary Hospital) were included in this study, of whom 137 were discharged and 54 died in hospital. 91 (48%) patients had a comorbidity, with hypertension being the most common (58 [30%] patients), followed by diabetes (36 [19%] patients) and coronary heart disease (15 [8%] patients). Multivariable regression showed increasing odds of in-hospital death associated with older age (odds ratio 1·10, 95% CI 1·03–1·17, per year increase; p=0·0043), higher Sequential Organ Failure Assessment (SOFA) score (5·65, 2·61–12·23; p<0·0001), and d-dimer greater than 1 μg/mL (18·42, 2·64–128·55; p=0·0033) on admission. Median duration of viral shedding was 20·0 days (IQR 17·0–24·0) in survivors, but SARS-CoV-2 was detectable until death in non-survivors. The longest observed duration of viral shedding in survivors was 37 days.
          Interpretation: The potential risk factors of older age, high SOFA score, and d-dimer greater than 1 μg/mL could help clinicians to identify patients with poor prognosis at an early stage. Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future.
          Funding: Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences; National Science Grant for Distinguished Young Scholars; National Key Research and Development Program of China; The Beijing Science and Technology Project; and Major Projects of National Science and Technology on New Drug Creation and Development.

            Characteristics of SARS-CoV-2 and COVID-19

            Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly transmissible and pathogenic coronavirus that emerged in late 2019 and has caused a pandemic of acute respiratory disease, named ‘coronavirus disease 2019’ (COVID-19), which threatens human health and public safety. In this Review, we describe the basic virology of SARS-CoV-2, including genomic characteristics and receptor use, highlighting its key difference from previously known coronaviruses. We summarize current knowledge of clinical, epidemiological and pathological features of COVID-19, as well as recent progress in animal models and antiviral treatment approaches for SARS-CoV-2 infection. We also discuss the potential wildlife hosts and zoonotic origin of this emerging virus in detail.

              Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal

              Objective: To review and critically appraise published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at risk of being admitted to hospital for covid-19 pneumonia.
              Design: Rapid systematic review and critical appraisal.
              Data sources: PubMed and Embase through Ovid, Arxiv, medRxiv, and bioRxiv up to 24 March 2020.
              Study selection: Studies that developed or validated a multivariable covid-19 related prediction model.
              Data extraction: At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool).
              Results: 2696 titles were screened, and 27 studies describing 31 prediction models were included. Three models were identified for predicting hospital admission from pneumonia and other events (as proxy outcomes for covid-19 pneumonia) in the general population; 18 diagnostic models for detecting covid-19 infection (13 were machine learning based on computed tomography scans); and 10 prognostic models for predicting mortality risk, progression to severe disease, or length of hospital stay. Only one study used patient data from outside of China. The most reported predictors of presence of covid-19 in patients with suspected disease included age, body temperature, and signs and symptoms. The most reported predictors of severe prognosis in patients with covid-19 included age, sex, features derived from computed tomography scans, C reactive protein, lactic dehydrogenase, and lymphocyte count. C index estimates ranged from 0.73 to 0.81 in prediction models for the general population (reported for all three models), from 0.81 to more than 0.99 in diagnostic models (reported for 13 of the 18 models), and from 0.85 to 0.98 in prognostic models (reported for six of the 10 models). All studies were rated at high risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, and high risk of model overfitting. Reporting quality varied substantially between studies. Most reports did not include a description of the study population or intended use of the models, and calibration of predictions was rarely assessed.
              Conclusion: Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that proposed models are poorly reported, at high risk of bias, and their reported performance is probably optimistic. Immediate sharing of well documented individual participant data from covid-19 studies is needed for collaborative efforts to develop more rigorous prediction models and validate existing ones. The predictors identified in included studies could be considered as candidate predictors for new models. Methodological guidance should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, studies should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline.
              Systematic review registration: Protocol https://osf.io/ehc47/, registration https://osf.io/wy245.

                Author and article information

                Contributors
                lv_cheng0816@163.com
                Huangluqi01@126.com
                Journal
                Infect Dis Poverty
                Infectious Diseases of Poverty
                BioMed Central (London)
                ISSN: 2095-5162, 2049-9957
                Published: 17 February 2022
                2022; 11: 19
                Affiliations
                [1] Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, No. 16, Nanxiao Street, Dongzhimen, Dongcheng District, Beijing 100700, China (GRID grid.410318.f; ISNI 0000 0004 0632 3409)
                [2] Department of Infectious Diseases, JinYinTan Hospital, Wuhan 430040, China (GRID grid.507952.c; ISNI 0000 0004 1764 577X)
                [3] Information Center, Chinese Center for Disease Control and Prevention, Beijing 102206, China (GRID grid.198530.6; ISNI 0000 0000 8803 2373)
                [4] National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16, Nanxiao Street, Dongzhimen, Dongcheng District, Beijing 100700, China (GRID grid.410318.f; ISNI 0000 0004 0632 3409)
                Article
                Publisher ID: 946
                DOI: 10.1186/s40249-022-00946-4
                PMCID: PMC8851750
                PMID: 35177120
                © The Author(s) 2022

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 31 August 2021
                Accepted: 9 February 2022
                Categories
                Research Article
