      Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13)

      research-article

          Abstract

          We describe AlphaFold, the protein structure prediction system that was entered by the group A7D in CASP13. Submissions were made by three free‐modeling (FM) methods which combine the predictions of three neural networks. All three systems were guided by predictions of distances between pairs of residues produced by a neural network. Two systems assembled fragments produced by a generative neural network, one using scores from a network trained to regress GDT_TS. The third system shows that simple gradient descent on a properly constructed potential is able to perform on par with more expensive traditional search techniques and without requiring domain segmentation. In the CASP13 FM assessors' ranking by summed z‐scores, this system scored highest with 68.3 vs 48.2 for the next closest group (an average GDT_TS of 61.4). The system produced high‐accuracy structures (with GDT_TS scores of 70 or higher) for 11 out of 43 FM domains. Despite not explicitly using template information, the results in the template category were comparable to the best performing template‐based methods.
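The abstract describes the third system as running simple gradient descent on a potential guided by predicted inter-residue distances. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes a toy harmonic potential (squared error between current and target pair distances) over Cartesian coordinates, because the abstract does not specify the form of the potential or the parameterization of the structure. All function names (`potential`, `gradient`, `descend`, `helix`, `target_distances`, `perturbed`) and the helix test data are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def potential(coords, targets):
    """Toy potential: squared error between current and target pair distances.

    `targets` maps index pairs (i, j) to a target distance. The harmonic
    form is an assumption made for illustration.
    """
    return sum(
        (distance(coords[i], coords[j]) - d) ** 2
        for (i, j), d in targets.items()
    )


def gradient(coords, targets):
    """Analytic gradient of `potential` with respect to every coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), d in targets.items():
        r = distance(coords[i], coords[j])
        if r < 1e-12:
            continue  # direction is undefined for coincident points; skip
        scale = 2.0 * (r - d) / r
        for k in range(3):
            g = scale * (coords[i][k] - coords[j][k])
            grad[i][k] += g
            grad[j][k] -= g
    return grad


def descend(coords, targets, step=0.02, iterations=2000):
    """Plain fixed-step gradient descent; returns the refined coordinates."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grad = gradient(coords, targets)
        for point, g in zip(coords, grad):
            for k in range(3):
                point[k] -= step * g[k]
    return coords


def helix(n, radius=2.3, rise=1.5, turn=1.75):
    """Hypothetical reference: n points on a helix (not real protein data)."""
    return [
        (radius * math.cos(turn * i), radius * math.sin(turn * i), rise * i)
        for i in range(n)
    ]


def target_distances(coords):
    """All pairwise distances of a reference structure, keyed by (i, j)."""
    n = len(coords)
    return {
        (i, j): distance(coords[i], coords[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


def perturbed(coords, amplitude=0.3):
    """Deterministic distortion of a structure, used as a starting point."""
    return [
        [c + amplitude * math.sin(3.0 * i + k) for k, c in enumerate(p)]
        for i, p in enumerate(coords)
    ]


if __name__ == "__main__":
    reference = helix(6)
    targets = target_distances(reference)
    start = perturbed(reference)
    refined = descend(start, targets)
    print("potential before:", potential(start, targets))
    print("potential after: ", potential(refined, targets))
```

In this toy setting the target distances come from a known reference structure, so the descent only has to undo a small distortion. In a prediction setting the targets would instead come from a model's distance predictions, and the potential would need to account for their uncertainty.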

                Author and article information

                Contributors
                andrewsenior@google.com
                Journal
                Proteins
                10.1002/(ISSN)1097-0134
                PROT
                John Wiley & Sons, Inc. (Hoboken, USA)
                0887-3585
                1097-0134
                11 November 2019
                December 2019
                Volume: 87
                Issue: 12, Critical Assessment of Methods of Protein Structure Prediction (CASP) Special Issue (doiID: 10.1002/prot.v87.12)
                Pages: 1141-1148
                Affiliations
                [1] DeepMind, London, UK
                [2] The Francis Crick Institute, London, UK
                [3] University College London, London, UK
                Author notes
                [*] Correspondence

                Andrew W. Senior, DeepMind, 6 Pancras Square, London, N1C 4AG, UK.

                Email: andrewsenior@google.com

                Author information
                https://orcid.org/0000-0002-2401-5691
                Article
                PROT25834
                10.1002/prot.25834
                7079254
                31602685
                4b785eaf-f585-4b0d-87b4-70cfef308684
                © 2019 The Authors. Proteins: Structure, Function, and Bioinformatics published by Wiley Periodicals, Inc.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

                History
                : 03 May 2019
                : 25 September 2019
                : 27 September 2019
                Page count
                Figures: 7, Tables: 1, Pages: 8, Words: 5279
                Categories
                Research Article
                3d Structure Modeling
                Research Articles
                Biochemistry
                casp,deep learning,machine learning,protein structure prediction
