
      Overly optimistic prediction results on imbalanced data: a case study of flaws and benefits when applying over-sampling

          Most cited references (51)

          SMOTE: Synthetic Minority Over-sampling Technique

          An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
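
          The abstract above says only that synthetic minority class examples are created. As a rough illustration, the sketch below assumes the usual SMOTE idea of interpolating between a minority sample and one of its k nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: names and defaults are assumptions, not the paper's code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class with two features (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

          Because every synthetic point is derived from existing samples, generating them before the data are split into training and test sets would let test-set information reach the training set. In a sketch like this, the over-sampling step would be applied to the training partition only.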

            PhysioBank, PhysioToolkit, and PhysioNet

            Circulation, 101(23)

              Global, regional, and national causes of under-5 mortality in 2000–15: an updated systematic analysis with implications for the Sustainable Development Goals

              Summary

              Background: Despite remarkable progress in the improvement of child survival between 1990 and 2015, the Millennium Development Goal (MDG) 4 target of a two-thirds reduction of under-5 mortality rate (U5MR) was not achieved globally. In this paper, we updated our annual estimates of child mortality by cause to 2000–15 to reflect on progress toward the MDG 4 and consider implications for the Sustainable Development Goals (SDG) target for child survival.

              Methods: We increased the estimation input data for causes of deaths by 43% among neonates and 23% among 1–59-month-olds, respectively. We used adequate vital registration (VR) data where available, and modelled cause-specific mortality fractions applying multinomial logistic regressions using adequate VR for low U5MR countries and verbal autopsy data for high U5MR countries. We updated the estimation to use Plasmodium falciparum parasite rate in place of malaria index in the modelling of malaria deaths; to use adjusted empirical estimates instead of modelled estimates for China; and to consider the effects of pneumococcal conjugate vaccine and rotavirus vaccine in the estimation.

              Findings: In 2015, among the 5·9 million under-5 deaths, 2·7 million occurred in the neonatal period. The leading under-5 causes were preterm birth complications (1·055 million [95% uncertainty range (UR) 0·935–1·179]), pneumonia (0·921 million [0·812–1·117]), and intrapartum-related events (0·691 million [0·598–0·778]). In the two MDG regions with the most under-5 deaths, the leading cause was pneumonia in sub-Saharan Africa and preterm birth complications in southern Asia. Reductions in mortality rates for pneumonia, diarrhoea, neonatal intrapartum-related events, malaria, and measles were responsible for 61% of the total reduction of 35 per 1000 livebirths in U5MR in 2000–15. Stratified by U5MR, pneumonia was the leading cause in countries with very high U5MR. Preterm birth complications and pneumonia were both important in high, medium high, and medium child mortality countries, whereas congenital abnormalities were the most important cause in countries with low and very low U5MR.

              Interpretation: In the SDG era, countries are advised to prioritise child survival policy and programmes based on their child cause-of-death composition. Continued and enhanced efforts to scale up proven life-saving interventions are needed to achieve the SDG child survival target.

              Funding: Bill & Melinda Gates Foundation, WHO.

                Author and article information

                Journal: Artificial Intelligence in Medicine
                Publisher: Elsevier BV
                ISSN: 0933-3657
                Date: January 2021
                Volume: 111
                Article number: 101987
                Type: Article
                DOI: 10.1016/j.artmed.2020.101987
                PMID: 33461687
                Record ID: 085859ae-9718-496c-b0ae-8cf088738108
                © 2021

                https://www.elsevier.com/tdm/userlicense/1.0/
