
      Water quality classification using machine learning algorithms


          Most cited references (73)


          A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models

          The objective of this study was to compare performance of logistic regression (LR) with machine learning (ML) for clinical prediction modeling in the literature.

            Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes.

            J V Tu (1996)
            Artificial neural networks are algorithms that can be used to perform nonlinear statistical modeling and provide a new alternative to logistic regression, the most commonly used method for developing predictive models for dichotomous outcomes in medicine. Neural networks offer a number of advantages, including requiring less formal statistical training, ability to implicitly detect complex nonlinear relationships between dependent and independent variables, ability to detect all possible interactions between predictor variables, and the availability of multiple training algorithms. Disadvantages include its "black box" nature, greater computational burden, proneness to overfitting, and the empirical nature of model development. An overview of the features of neural networks and logistic regression is presented, and the advantages and disadvantages of using this modeling technique are discussed.
              Is Open Access

              CatBoost for big data: an interdisciplinary review

              Gradient Boosted Decision Trees (GBDT’s) are a powerful tool for classification and regression tasks in Big Data. Researchers should be familiar with the strengths and weaknesses of current implementations of GBDT’s in order to use them effectively and make successful contributions. CatBoost is a member of the family of GBDT machine learning ensemble techniques. Since its debut in late 2018, researchers have successfully used CatBoost for machine learning studies involving Big Data. We take this opportunity to review recent research on CatBoost as it relates to Big Data, and learn best practices from studies that cast CatBoost in a positive light, as well as studies where CatBoost does not outshine other techniques, since we can learn lessons from both types of scenarios. Furthermore, as a Decision Tree based algorithm, CatBoost is well-suited to machine learning tasks involving categorical, heterogeneous data. Recent work across multiple disciplines illustrates CatBoost’s effectiveness and shortcomings in classification and regression tasks. Another important issue we expose in literature on CatBoost is its sensitivity to hyper-parameters and the importance of hyper-parameter tuning. One contribution we make is to take an interdisciplinary approach to cover studies related to CatBoost in a single work. This provides researchers an in-depth understanding to help clarify proper application of CatBoost in solving problems. To the best of our knowledge, this is the first survey that studies all works related to CatBoost in a single publication.

                Author and article information

                Journal
                Journal of Water Process Engineering
                Elsevier BV
                ISSN: 2214-7144
                August 2022
                Volume: 48
                Article number: 102920
                Article
                DOI: 10.1016/j.jwpe.2022.102920
                © 2022

                https://www.elsevier.com/tdm/userlicense/1.0/

                https://doi.org/10.15223/policy-017

                https://doi.org/10.15223/policy-037

                https://doi.org/10.15223/policy-012

                https://doi.org/10.15223/policy-029

                https://doi.org/10.15223/policy-004
