      Is Open Access

      Drinking Water Resources Suitability Assessment Based on Pollution Index of Groundwater Using Improved Explainable Artificial Intelligence


          Abstract

          The global significance of fluoride and nitrate contamination in coastal areas cannot be overstated, as these contaminants pose critical environmental and public health challenges across the world. Water quality is an essential component in sustaining environmental health. This integrated study aimed to assess indexical and spatial water quality, potential contamination sources, and health risks associated with groundwater resources in Al-Hassa, Saudi Arabia. Groundwater samples were tested using standard methods. The physicochemical results indicated overall groundwater pollution. This study addresses the critical issue of drinking water resource suitability assessment by introducing an innovative approach based on the pollution index of groundwater (PIG). Focusing on the eastern region of Saudi Arabia, where water resource management is of paramount importance, we employed advanced machine learning (ML) models to forecast groundwater suitability using three input combinations (C1 = EC + Na + Mg + Cl, C2 = TDS + TA + HCO3 + K + Ca, and C3 = SO4 + pH + NO3 + F + Turb). Six ML models, namely random forest (RF), decision trees (DT), XGBoost, CatBoost, linear regression, and support vector machines (SVM), were used to predict groundwater quality. Evaluated on several performance criteria (MAPE, MAE, MSE, and DC), these models offer valuable insights into the complex relationships governing groundwater pollution, with an accuracy of more than 90%. To enhance the transparency and interpretability of the ML models, we incorporated a local, model-agnostic explanation method, SHapley Additive exPlanations (SHAP). SHAP allows us to interpret the prediction-making process of otherwise opaque black-box models. We believe that the integration of ML models and SHAP-based explainability offers a promising avenue for sustainable water resource management in Saudi Arabia and can serve as a model for addressing similar challenges worldwide. By bridging the gap between complex data-driven predictions and actionable insights, this study contributes to the advancement of environmental stewardship and water security in the region.
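          The four performance criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the formulas here are assumptions: MAE, MSE, and MAPE follow their usual definitions, and DC is taken to be the determination coefficient, 1 - SS_res / SS_tot. The function names and the sample values are hypothetical and are not data from the study.

```python
from statistics import fmean


def mae(observed, predicted):
    """Mean absolute error."""
    return fmean(abs(o - p) for o, p in zip(observed, predicted))


def mse(observed, predicted):
    """Mean squared error."""
    return fmean((o - p) ** 2 for o, p in zip(observed, predicted))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any observed value is zero.
    """
    return 100.0 * fmean(abs((o - p) / o) for o, p in zip(observed, predicted))


def dc(observed, predicted):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical PIG values, for illustration only (not from the paper).
observed_pig = [0.8, 1.2, 1.6, 2.0]
predicted_pig = [0.9, 1.1, 1.5, 2.2]

scores = {
    "MAE": mae(observed_pig, predicted_pig),
    "MSE": mse(observed_pig, predicted_pig),
    "MAPE": mape(observed_pig, predicted_pig),
    "DC": dc(observed_pig, predicted_pig),
}
```

          Under these assumed definitions, lower MAE, MSE, and MAPE and a DC closer to 1 indicate a better fit; how the study maps these criteria to its reported accuracy is not stated in the abstract.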


                Author and article information

                Journal: Sustainability (SUSTDE)
                Publisher: MDPI AG
                ISSN: 2071-1050
                Published: November 06 2023
                Volume: 15
                Issue: 21
                Article number: 15655
                Type: Article
                DOI: 10.3390/su152115655
                © 2023

                https://creativecommons.org/licenses/by/4.0/
