      Is Open Access

      Reinvent 4: Modern AI-driven generative molecule design

      research-article


          Abstract

          REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software uses recurrent neural networks and transformer architectures to drive molecule generation. These generators are embedded within general machine learning optimization algorithms: transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI-based molecule generation. An additional goal of the release is to create a framework for education and future innovation in AI-based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license.

          Scientific contribution. The software provides an open-source reference implementation for generative molecular design; it is also used in production to support in-house drug discovery projects. Publishing the most common machine learning algorithms in one code base, together with full documentation, will increase the transparency of AI and foster innovation, collaboration and education.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s13321-024-00812-5.


                Author and article information

                Contributors
                hannes.loffler@astrazeneca.com
                Journal
                J Cheminform
                Journal of Cheminformatics
                Springer International Publishing (Cham)
                ISSN: 1758-2946
                Published: 21 February 2024
                Volume: 16
                Article number: 20
                Affiliations
                [1] Molecular AI, Discovery Sciences, R&D, AstraZeneca (https://ror.org/04wwrrg31), Gothenburg, Sweden
                [2] Molecular AI, Discovery Sciences, R&D, AstraZeneca (GRID grid.417815.e, ISNI 0000 0004 5929 4381), Cambridge, UK
                Article
                812
                DOI: 10.1186/s13321-024-00812-5
                PMC ID: 10882833
                PubMed ID: 38383444
                2332ef1b-bd65-4c7a-8052-c7614add111d
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 23 November 2023
                Accepted: 9 February 2024
                Categories
                Software
                Custom metadata
                © Springer Nature Switzerland AG 2024

                Chemoinformatics
                generative AI, reinforcement learning, transfer learning, multi-parameter optimization, recurrent neural networks, transformers
