      Is Open Access

      Scaffolding protein functional sites using deep learning

      research-article


          Abstract

          The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. We describe deep learning approaches for scaffolding such functional sites without needing to pre-specify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination”, optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting”, starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use the methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins, and validate the designs using a combination of in silico and experimental tests.
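
          One in silico check that follows naturally from the abstract is whether a designed structure actually contains the desired functional site. The abstract does not name a metric, so the following is only a minimal sketch under an assumption: that the check is the RMSD between the target motif atoms and the corresponding atoms of the design after optimal rigid superposition (the Kabsch algorithm). The function name `motif_rmsd` and the toy coordinates are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def motif_rmsd(design_xyz, target_xyz):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Hypothetical helper: rows of both arrays are assumed to be the same atoms
    in the same order (for example, motif backbone atoms).
    """
    P = np.asarray(design_xyz, dtype=float)
    Q = np.asarray(target_xyz, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("both inputs must have the same shape (N, 3)")

    # Remove translation by centering both point sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the centered design onto the centered target and measure deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    # Toy data (made up): a five-atom "motif" and a rigidly moved copy of it.
    rng = np.random.default_rng(0)
    target = rng.normal(size=(5, 3))
    angle = 0.7
    rot_z = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    design = target @ rot_z.T + np.array([1.0, -2.0, 3.0])
    # A rigidly moved copy should superimpose almost exactly (value near zero).
    print(motif_rmsd(design, target))
```

          In a sketch like this, a low value would indicate that the scaffold reproduces the site geometry; any acceptance threshold would be a design choice and is not specified here.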


                Author and article information

                Journal
                Journal IDs: 0404511, 7473
                Title: Science (New York, N.Y.)
                ISSN (print): 0036-8075
                ISSN (electronic): 1095-9203
                Dates: 15 October 2022; 22 July 2022; 21 July 2022; 31 October 2022
                Volume: 377
                Issue: 6604
                Pages: 387-394
                Affiliations
                [a] Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
                [b] Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
                [c] Graduate program in Biological Physics, Structure and Design, University of Washington, Seattle, WA 98105, USA
                [d] FAS Division of Science, Harvard University, Cambridge, MA 02138, USA
                [e] John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
                [f] Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA
                [g] Molecular Engineering Graduate Program, University of Washington, Seattle, WA 98105, USA
                [h] Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
                Author notes
                [†] These authors contributed equally to this work.

                Author contributions

                Designed the research: JW, SL, DJ, DT, JLW, SO, DB

                Developed the motif-constrained hallucination method: JW, DT, SL, IA, SO

                Contributed code and ideas for hallucination: MB, JD

                Generated designs using hallucination: JW, SL, DT, SO

                Developed the inpainting method: DJ, JLW

                Contributed code and ideas for inpainting: MB, JW, SL, DT

                Generated designs using inpainting: DJ, JLW, AS

                Analyzed data: JW, SL, DJ, DT, JLW, ME

                Trained neural networks: DJ, JLW, MB

                Performed RSV-F experiments: KMC, RR, LFM, JW

                Performed Di-iron experiments: JLW, DJ

                Performed EF-hand experiments: AS, JLW

                Performed PD-L1 experiments: WY, DRH, JW, SL, DJ

                Contributed reagents and technical expertise: TS, JHC, LFM, NB, BIMW, BC, AM, FD

                Wrote the manuscript: JW, DJ, JLW, SL, DT, SO, DB

                [*] To whom correspondence should be addressed: dabaker@uw.edu, so@fas.harvard.edu
                Article
                NIHMS ID: NIHMS1830590
                DOI: 10.1126/science.abn2100
                PMCID: PMC9621694
                PMID: 35862514
                e995fb42-59a7-4d67-b6f4-754c6aa50b4c

                This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.

