      New machine learning and physics-based scoring functions for drug discovery

      research-article


          Abstract

          Scoring functions are essential for modern in silico drug discovery. However, accurately predicting binding affinity with scoring functions remains challenging, and their performance is very heterogeneous across target classes. Scoring functions based on precise physics-based descriptors that better represent the protein–ligand recognition process are strongly needed. We developed a set of new empirical scoring functions, named DockTScore, by explicitly accounting for physics-based terms combined with machine learning. Target-specific scoring functions were developed for two important drug targets, proteases and protein–protein interactions, representing an original class of molecules for drug discovery. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interaction terms, and an improved term accounting for the ligand torsional entropy contribution to ligand binding. The DockTScore scoring functions proved competitive with the current best-evaluated scoring functions in terms of binding energy prediction and ranking on four DUD-E datasets, and will be useful for in silico drug design for diverse proteins as well as for specific targets such as proteases and protein–protein interactions. Currently, the MLR DockTScore is available at www.dockthor.lncc.br.
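
As a rough illustration of the MLR idea named in the abstract (an empirical score expressed as a weighted sum of physics-based terms, with weights fitted to measured affinities), the following minimal sketch fits a linear scoring function by ordinary least squares. It is not the DockTScore implementation: the descriptor names, the weights in `true_w`, and the synthetic noise-free data are all assumptions made purely for illustration.

```python
import random

# Hypothetical descriptor names, loosely modelled on the term types listed in
# the abstract; they are NOT the actual DockTScore descriptors.
TERMS = ["vdw", "electrostatic", "solvation", "lipophilic", "torsional_entropy"]


def score(w, x):
    """Linear score: weighted sum of the descriptors plus an intercept (w[-1])."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A column of ones is appended to X, so the last fitted weight is the intercept.
    """
    rows = [list(r) + [1.0] for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(col + 1, n):
            f = A[k][col] / A[col][col]
            for j in range(col, n + 1):
                A[k][j] -= f * A[col][j]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = A[i][n] - sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / A[i][i]
    return w


# Synthetic, noise-free toy data generated from assumed "true" weights.
rng = random.Random(0)
true_w = [0.12, 0.05, -0.30, -0.08, 0.25, -1.5]  # last entry is the intercept
X = [[rng.uniform(-10.0, 10.0) for _ in TERMS] for _ in range(200)]
y = [score(true_w, x) for x in X]

weights = fit_mlr(X, y)
for name, w in zip(TERMS + ["intercept"], weights):
    print(f"{name:>18s}: {w: .4f}")
```

Because the toy data contain no noise, the fit simply recovers the assumed weights. With real affinity data one would hold out a test set and compare against nonlinear models such as the support vector machine and random forest approaches the abstract mentions.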


          Most cited references (78)

          Random Forests

            Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments.

            Structure-based virtual screening plays an important role in drug discovery and complements other screening approaches. In general, protein crystal structures are prepared prior to docking in order to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes, and perform other operations that are not part of the x-ray crystal structure refinement process. In addition, ligands must be prepared to create 3-dimensional geometries, assign proper bond orders, and generate accessible tautomer and ionization states prior to virtual screening. While the prerequisite for proper system preparation is generally accepted in the field, an extensive study of the preparation steps and their effect on virtual screening enrichments has not been performed. In this work, we systematically explore each of the steps involved in preparing a system for virtual screening. We first explore a large number of parameters using the Glide validation set of 36 crystal structures and 1,000 decoys. We then apply a subset of protocols to the DUD database. We show that database enrichment is improved with proper preparation and that neglecting certain steps of the preparation process produces a systematic degradation in enrichments, which can be large for some targets. We provide examples illustrating the structural changes introduced by the preparation that impact database enrichment. While the work presented here was performed with the Protein Preparation Wizard and Glide, the insights and guidance are expected to be generalizable to structure-based virtual screening with other docking methods.

              PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions.

              In this study, we have revised the rules and parameters for one of the most commonly used empirical pKa predictors, PROPKA, based on better physical description of the desolvation and dielectric response for the protein. We have introduced a new and consistent approach to interpolate the description between the previously distinct classifications into internal and surface residues, which otherwise is found to give rise to an erratic and discontinuous behavior. Since the goal of this study is to lay out the framework and validate the concept, it focuses on Asp and Glu residues where the protein pKa values and structures are assumed to be more reliable. The new and improved implementation is evaluated and discussed; it is found to agree better with experiment than the previous implementation (in parentheses): rmsd = 0.79 (0.91) for Asp and Glu, 0.75 (0.97) for Tyr, 0.65 (0.72) for Lys, and 1.00 (1.37) for His residues. The most significant advance, however, is in reducing the number of outliers and removing unreasonable sensitivity to small structural changes that arise from classifying residues as either internal or surface.

                Author and article information

                Contributors
                dardenne@lncc.br
                maria.mitev@inserm.fr
                Journal
                Sci Rep
                Scientific Reports
                Nature Publishing Group UK (London)
                2045-2322
                4 February 2021
                2021
                Volume: 11
                Article number: 3198
                Affiliations
                [1] Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil (GRID grid.452576.7; ISNI 0000 0004 0602 9007)
                [2] Fundação Oswaldo Cruz, Rio de Janeiro, 21040-361, Brazil (GRID grid.418068.3; ISNI 0000 0001 0723 0931)
                [3] Inserm U973, Université Paris Diderot, Paris, France (GRID grid.508487.6; ISNI 0000 0004 7885 7602)
                [4] Structural Bioinformatics Unit, CNRS UMR3528, Institut Pasteur, 75015 Paris, France (GRID grid.428999.7; ISNI 0000 0001 2353 6535)
                [5] Inserm U1268 “Medicinal Chemistry and Translational Research”, CiTCoM, UMR 8038, CNRS, Université de Paris, 75006 Paris, France
                Article
                Publisher ID: 82410
                DOI: 10.1038/s41598-021-82410-1
                PMCID: PMC7862620
                PMID: 33542326
                f2b2bcf3-9f4d-4b9a-9058-b90140caf51c
                © The Author(s) 2021

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 2 November 2020
                Accepted: 20 January 2021
                Funding
                Funded by: CNPq
                Award ID: 307634/2019-1 and 306894/2019-0
                Funded by: FundRef http://dx.doi.org/10.13039/501100004586, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro;
                Award ID: E-26/010.001229/2015 and E-26/210.935/2019
                Funded by: PCI-LNCC
                Funded by: FundRef http://dx.doi.org/10.13039/501100001677, Institut National de la Santé et de la Recherche Médicale;
                Funded by: FundRef http://dx.doi.org/10.13039/501100005736, Université Paris Diderot;
                Funded by: FundRef http://dx.doi.org/10.13039/501100001665, Agence Nationale de la Recherche;
                Award ID: ToxME
                Funded by: Univ. Paris
                Categories
                Article

                Keywords
                drug discovery, computational biophysics, cheminformatics
