
      Reconstruction of Nuclear Ensemble Approach Electronic Spectra Using Probabilistic Machine Learning

      Research article, Journal of Chemical Theory and Computation, American Chemical Society


          Abstract

          The theoretical prediction of molecular electronic spectra by means of quantum mechanical (QM) computations is fundamental to gaining deep insight into many photophysical and photochemical processes. A computational strategy that is attracting significant attention is the so-called Nuclear Ensemble Approach (NEA), which relies on generating a representative ensemble of nuclear geometries around the equilibrium structure, computing the vertical excitation energies (ΔE) and oscillator strengths (f), and phenomenologically broadening each transition with a line-shape function of empirical full width δ. Frequently, δ is chosen by visually finding the trade-off between artificial vibronic features (small δ) and over-smoothing of electronic signatures (large δ). Nevertheless, this approach is not satisfactory, as it relies on subjective perception and may lead to spectral inaccuracies, especially when the number of sampled configurations is limited by an excessive computational burden (high-level QM methods, complex systems, solvent effects, etc.). In this work, we have developed and tested a new approach to reconstruct NEA spectra, dubbed GMM-NEA, based on Gaussian Mixture Models (GMMs), a probabilistic machine learning algorithm, that circumvents the phenomenological broadening assumption and, in turn, the use of δ altogether. We show that GMM-NEA systematically outperforms other data-driven models for automatically selecting δ, especially for small datasets. In addition, we report the use of an algorithm to detect anomalous QM computations (outliers) that can affect the overall shape and uncertainty of the NEA spectra. Finally, we apply GMM-NEA to predict the photolysis rate of HgBrOOH, a compound involved in Earth’s atmospheric chemistry.
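
          The conventional broadening step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and it does not implement GMM-NEA; it only shows how a set of (ΔE, f) pairs is turned into a spectrum by placing a normalized Gaussian of full width at half maximum δ on each transition and averaging over the ensemble. The function names, the toy data, and the energy grid are all hypothetical, and physical prefactors are omitted, so the intensities are in arbitrary units.

```python
import math

def gaussian_line(x, center, fwhm):
    """Normalized Gaussian line shape with the given full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((x - center) / sigma) ** 2)

def nea_spectrum(energies, strengths, grid, fwhm):
    """Ensemble-averaged spectrum: mean over geometries of f_i * g(E - dE_i; fwhm).

    Simplified on purpose: physical constants are dropped (arbitrary units).
    """
    n = len(energies)
    return [
        sum(f * gaussian_line(e, de, fwhm) for de, f in zip(energies, strengths)) / n
        for e in grid
    ]

# Hypothetical toy ensemble: vertical excitation energies (eV) and oscillator strengths.
delta_e = [4.10, 4.25, 4.32, 4.48, 4.60]
osc = [0.020, 0.035, 0.030, 0.025, 0.015]

# Energy grid from 3.5 eV to 5.5 eV.
step = 0.01
grid = [3.5 + step * i for i in range(201)]

# Small delta keeps every sampled transition as a separate spike;
# large delta smooths them into a broad band.
narrow = nea_spectrum(delta_e, osc, grid, fwhm=0.05)
broad = nea_spectrum(delta_e, osc, grid, fwhm=0.40)
```

          With so few sampled geometries, the small-δ curve shows one spike per transition (the artificial structure the abstract warns about), while the large-δ curve merges them into a smooth band. Both curves integrate to approximately the mean oscillator strength, so δ changes the shape of the spectrum but not its area.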


                Author and article information

                Journal
                J Chem Theory Comput
                1549-9618
                1549-9626
                28 April 2022
                10 May 2022
                Volume: 18
                Issue: 5
                Pages: 3052-3064
                Affiliations
                Institut de Ciència Molecular, Universitat de València, València 46071, Spain
                Author notes
                Author information
                https://orcid.org/0000-0002-7174-2453
                https://orcid.org/0000-0001-6495-2770
                Article
                10.1021/acs.jctc.2c00004
                9097286
                35481363
                82573d1e-c4b6-417f-a634-2082696fb2e0
                © 2022 The Authors. Published by American Chemical Society

                Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).

                History
                : 02 January 2022
                Funding
                Funded by: Ministerio de Economía y Competitividad, doi 10.13039/501100003329;
                Award ID: CTQ2017-87054-C2-2-P
                Funded by: European Regional Development Fund, doi 10.13039/501100008530;
                Award ID: CTQ2017-87054-C2-2-P
                Funded by: Ministerio de Ciencia e Innovación, doi 10.13039/501100004837;
                Award ID: RYC2015-19234
                Categories
                Article
                Custom metadata
                ct2c00004

                Computational chemistry & Modeling
