Reconstruction of Nuclear Ensemble Approach Electronic Spectra Using Probabilistic Machine Learning

Abstract

The theoretical prediction of molecular electronic spectra by means of quantum mechanical (QM) computations is fundamental to gaining deep insight into many photophysical and photochemical processes. A computational strategy that is attracting significant attention is the so-called Nuclear Ensemble Approach (NEA), which relies on generating a representative ensemble of nuclear geometries around the equilibrium structure, computing the vertical excitation energies (ΔE) and oscillator strengths (f), and phenomenologically broadening each transition with a line-shape function of empirical full width δ. Frequently, δ is chosen by visually finding the trade-off between artificial vibronic features (small δ) and over-smoothing of electronic signatures (large δ). Nevertheless, this approach is not satisfactory, as it relies on subjective perception and may lead to spectral inaccuracies, especially when the number of sampled configurations is limited by an excessive computational burden (high-level QM methods, complex systems, solvent effects, etc.). In this work, we have developed and tested a new approach to reconstruct NEA spectra, dubbed GMM-NEA, based on Gaussian Mixture Models (GMMs), a probabilistic machine learning algorithm, which circumvents the phenomenological broadening assumption and, in turn, the use of δ altogether. We show that GMM-NEA systematically outperforms other data-driven models that automatically select δ, especially for small datasets. In addition, we report the use of an algorithm to detect anomalous QM computations (outliers) that can affect the overall shape and uncertainty of the NEA spectra. Finally, we apply GMM-NEA to predict the photolysis rate of HgBrOOH, a compound involved in Earth’s atmospheric chemistry.
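
To make the role of δ concrete, the following is a minimal Python sketch of the conventional phenomenological broadening step that the abstract describes (not of GMM-NEA itself, and not the authors' code). It assumes a Gaussian line shape and a purely synthetic ensemble of ΔE and f values; the function names `gaussian_lineshape`, `nea_spectrum`, and `count_local_maxima`, and all numerical values, are hypothetical choices made for illustration.

```python
import math
import random


def gaussian_lineshape(e, e0, fwhm):
    """Area-normalized Gaussian centered at e0 with full width at half maximum fwhm."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((e - e0) / sigma) ** 2) / norm


def nea_spectrum(energies, strengths, grid, fwhm):
    """Ensemble average of f-weighted Gaussians on an energy grid (arbitrary units)."""
    n = len(energies)
    return [
        sum(f * gaussian_lineshape(e, de, fwhm) for de, f in zip(energies, strengths)) / n
        for e in grid
    ]


def count_local_maxima(values):
    """Number of interior grid points strictly higher than both neighbors."""
    return sum(
        1 for i in range(1, len(values) - 1) if values[i - 1] < values[i] > values[i + 1]
    )


# Synthetic ensemble (assumption): 50 geometries, one transition each.
rng = random.Random(0)
energies = [rng.gauss(4.0, 0.3) for _ in range(50)]          # Delta E in eV
strengths = [abs(rng.gauss(0.1, 0.02)) for _ in range(50)]   # oscillator strengths f
grid_step = 0.01
grid = [2.5 + grid_step * i for i in range(301)]             # 2.5 to 5.5 eV

narrow = nea_spectrum(energies, strengths, grid, fwhm=0.02)  # small delta: spiky
broad = nea_spectrum(energies, strengths, grid, fwhm=0.5)    # large delta: smooth

print("local maxima, small delta:", count_local_maxima(narrow))
print("local maxima, large delta:", count_local_maxima(broad))
```

With a small δ, the reconstructed band breaks up into many spurious peaks, one per sampled geometry or small cluster of geometries. A large δ merges them into a smooth envelope, at the risk of washing out genuine features. This is the trade-off that the visual selection of δ tries to balance.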

Author and article information

Journal

Journal ID (nlm-ta): J Chem Theory Comput

Journal ID (iso-abbrev): J Chem Theory Comput

Journal ID (publisher-id): ct

Journal ID (coden): jctcce

Title: Journal of Chemical Theory and Computation

Publisher: American Chemical Society

ISSN (Print): 1549-9618

ISSN (Electronic): 1549-9626

Publication date (Electronic): 28 April 2022

Publication date (Print): 10 May 2022

Volume: 18

Issue: 5

Pages: 3052-3064

Affiliations

Institut de Ciència Molecular, Universitat de València, València 46071, Spain

Author notes

[*] Email: luis.cerdan@uv.es

[*] Email: daniel.roca@uv.es

Author information

Luis Cerdán https://orcid.org/0000-0002-7174-2453

Daniel Roca-Sanjuán https://orcid.org/0000-0001-6495-2770

Article

DOI: 10.1021/acs.jctc.2c00004

PMC ID: 9097286

PubMed ID: 35481363

SO-VID: 82573d1e-c4b6-417f-a634-2082696fb2e0

License: Permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).

History

Date received: 02 January 2022

Funding

Funded by: Ministerio de Economía y Competitividad, doi 10.13039/501100003329;

Award ID: CTQ2017-87054-C2-2-P

Funded by: European Regional Development Fund, doi 10.13039/501100008530;

Award ID: CTQ2017-87054-C2-2-P

Funded by: Ministerio de Ciencia e Innovación, doi 10.13039/501100004837;

Award ID: RYC2015-19234

Custom metadata

ScienceOpen disciplines: Computational chemistry & Modeling
