      Is Open Access

      Machine Learning of Reaction Properties via Learned Representations of the Condensed Graph of Reaction

      review-article
      Journal of Chemical Information and Modeling
      American Chemical Society


          Abstract

          The estimation of chemical reaction properties such as activation energies, rates, or yields is a central topic of computational chemistry. In contrast to molecular properties, where machine learning approaches such as graph convolutional neural networks (GCNNs) have excelled for a wide variety of tasks, no general and transferable adaptations of GCNNs for reactions have been developed yet. We therefore combined a popular cheminformatics reaction representation, the so-called condensed graph of reaction (CGR), with a recent GCNN architecture to arrive at a versatile, robust, and compact deep learning model. The CGR is a superposition of the reactant and product graphs of a chemical reaction and thus an ideal input for graph-based machine learning approaches. The model learns to create a data-driven, task-dependent reaction embedding that does not rely on expert knowledge, similar to current molecular GCNNs. Our approach outperforms current state-of-the-art models in accuracy, is applicable even to imbalanced reactions, and possesses excellent predictive capabilities for diverse target properties, such as activation energies, reaction enthalpies, rate constants, yields, or reaction classes. We furthermore curated a large set of atom-mapped reactions along with their target properties, which can serve as benchmark data sets for future work. All data sets and the developed reaction GCNN model are available online, free of charge, and open source.
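The superposition idea behind the CGR can be illustrated with a minimal sketch. This is not the authors' implementation: the bond-table representation, the function names (`bond`, `condensed_graph`, `changed_bonds`), and the toy reaction are assumptions made here for illustration. Each side of an atom-mapped reaction is reduced to a table of bond orders keyed by atom-map numbers, and the two tables are merged so that every edge carries a (reactant, product) pair of bond orders, with 0 marking a bond that is absent on that side.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical toy types: each side of the reaction is a bond table keyed by a
# pair of atom-map numbers, with the bond order as the value.
Bonds = Dict[FrozenSet[int], float]
CGR = Dict[FrozenSet[int], Tuple[float, float]]


def bond(a: int, b: int) -> FrozenSet[int]:
    """Order-independent key for the bond between mapped atoms a and b."""
    return frozenset((a, b))


def condensed_graph(reactants: Bonds, products: Bonds) -> CGR:
    """Superimpose both bond tables; 0.0 means the bond is absent on that side."""
    cgr: CGR = {}
    for pair in reactants.keys() | products.keys():
        cgr[pair] = (reactants.get(pair, 0.0), products.get(pair, 0.0))
    return cgr


def changed_bonds(cgr: CGR) -> CGR:
    """Keep only the edges whose bond order differs between the two sides."""
    return {pair: orders for pair, orders in cgr.items() if orders[0] != orders[1]}


# Toy substitution (assumed example; only one hydrogen is listed for brevity):
# Cl(1)-C(2)(-H(4)) + Br(3) -> Cl(1) + C(2)(-H(4))-Br(3)
reactants = {bond(1, 2): 1.0, bond(2, 4): 1.0}
products = {bond(2, 3): 1.0, bond(2, 4): 1.0}

cgr = condensed_graph(reactants, products)
for pair, (before, after) in sorted(cgr.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), before, after)
```

In this sketch the broken C-Cl bond and the formed C-Br bond appear as edges with differing labels, while the C-H bond is unchanged. A graph network operating on such a structure would also need per-atom features, which are omitted here.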

          Most cited references (57)


          Extended-connectivity fingerprints.

          Extended-connectivity fingerprints (ECFPs) are a novel class of topological fingerprints for molecular characterization. Historically, topological fingerprints were developed for substructure and similarity searching. ECFPs were developed specifically for structure-activity modeling. ECFPs are circular fingerprints with a number of useful qualities: they can be very rapidly calculated; they are not predefined and can represent an essentially infinite number of different molecular features (including stereochemical information); their features represent the presence of particular substructures, allowing easier interpretation of analysis results; and the ECFP algorithm can be tailored to generate different types of circular fingerprints, optimized for different uses. While the use of ECFPs has been widely adopted and validated, a description of their implementation has not previously been presented in the literature.

            SchNet – A deep learning architecture for molecules and materials


              Planning chemical syntheses with deep neural networks and symbolic AI

              To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.

                Author and article information

                Journal
                J Chem Inf Model
                ci
                jcisd8
                Journal of Chemical Information and Modeling
                American Chemical Society
                ISSN: 1549-9596 (print), 1549-960X (electronic)
                04 November 2021
                09 May 2022
                Volume: 62
                Issue: 9, From Reaction Informatics to Chemical Space
                Pages: 2101-2110
                Affiliations
                [1]Department of Chemical Engineering, Massachusetts Institute of Technology , Cambridge, Massachusetts 02139, United States
                Author notes
                Author information
                https://orcid.org/0000-0002-8404-6596
                https://orcid.org/0000-0003-2603-9694
                Article
                DOI: 10.1021/acs.jcim.1c00975
                PMC ID: 9092344
                PubMed ID: 34734699
                e8d33293-8cbd-4834-9534-bdb3f9159d17
                © 2021 The Authors. Published by American Chemical Society

                Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained ( https://creativecommons.org/licenses/by/4.0/).

                History
                : 11 August 2021
                Funding
                Funded by: Austrian Science Fund, doi 10.13039/501100002428;
                Award ID: J 4415
                Funded by: Machine Learning for Pharmaceutical Discovery and Synthesis Consortium (MLPDS), doi NA;
                Award ID: NA
                Categories
                Article
                Custom metadata
                ci1c00975

                Computational chemistry & Modeling
