Scaffolding protein functional sites using deep learning

Abstract

The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. We describe deep learning approaches for scaffolding such functional sites without needing to pre-specify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination”, optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting”, starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use the methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins, and validate the designs using a combination of in silico and experimental tests.
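
The idea of optimizing a sequence against a site-based objective can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the abstract does not specify the optimizer or the loss, so the simulated-annealing loop, the function names (`hallucinate`, `toy_site_loss`), and the example motif are all assumptions made for illustration. In particular, `toy_site_loss` only counts sequence mismatches and stands in for the real step of predicting a structure with a neural network and scoring how well it contains the functional site.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_site_loss(seq, motif):
    """Hypothetical stand-in for 'predict the structure, then score the site'.

    Returns the fewest mismatches between `motif` and any window of `seq`
    (assumes len(seq) >= len(motif)). A real loss would be computed from a
    structure-prediction network, not from the sequence alone.
    """
    windows = (seq[i:i + len(motif)] for i in range(len(seq) - len(motif) + 1))
    return min(sum(a != b for a, b in zip(w, motif)) for w in windows)


def hallucinate(length, loss_fn, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over sequences (an assumed optimizer).

    Each step mutates one residue, keeps the change if the loss does not
    get worse, and otherwise keeps it with Metropolis probability.
    Returns the best sequence seen and its loss.
    """
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    loss = loss_fn("".join(seq))
    best_seq, best_loss = "".join(seq), loss
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        new_loss = loss_fn("".join(seq))
        accept = new_loss <= loss or rng.random() < math.exp((loss - new_loss) / temp)
        if accept:
            loss = new_loss
            if loss < best_loss:
                best_seq, best_loss = "".join(seq), loss
        else:
            seq[pos] = old  # reject the mutation and restore the residue
    return best_seq, best_loss


if __name__ == "__main__":
    MOTIF = "DKDGDGTI"  # arbitrary illustrative motif, not taken from the paper
    design, score = hallucinate(40, lambda s: toy_site_loss(s, MOTIF))
    print(design, score)
```

Passing the loss as a callable keeps the search loop independent of how a design is scored, so the toy objective could in principle be swapped for one derived from predicted structures.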

Author and article information

Journal

Journal ID (nlm-journal-id): 0404511

Journal ID (pubmed-jr-id): 7473

Journal ID (nlm-ta): Science

Journal ID (iso-abbrev): Science

Title: Science (New York, N.Y.)

ISSN (Print): 0036-8075

ISSN (Electronic): 1095-9203

Publication date Nihms-submitted: 15 October 2022

Publication date (Print): 22 July 2022

Publication date (Electronic): 21 July 2022

Publication date PMC-release: 31 October 2022

Volume: 377

Issue: 6604

Pages: 387-394

Affiliations

[a] Department of Biochemistry, University of Washington, Seattle, WA 98105, USA

[b] Institute for Protein Design, University of Washington, Seattle, WA 98105, USA

[c] Graduate program in Biological Physics, Structure and Design, University of Washington, Seattle, WA 98105, USA

[d] FAS Division of Science, Harvard University, Cambridge, MA 02138, USA

[e] John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA

[f] Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA

[g] Molecular Engineering Graduate Program, University of Washington, Seattle, WA 98105, USA

[h] Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland

Author notes

[†] These authors contributed equally to this work.

Author contributions

Designed the research: JW, SL, DJ, DT, JLW, SO, DB

Developed the motif-constrained hallucination method: JW, DT, SL, IA, SO

Contributed code and ideas for hallucination: MB, JD

Generated designs using hallucination: JW, SL, DT, SO

Developed the inpainting method: DJ, JLW

Contributed code and ideas for inpainting: MB, JW, SL, DT

Generated designs using inpainting: DJ, JLW, AS

Analyzed data: JW, SL, DJ, DT, JLW, ME

Trained neural networks: DJ, JLW, MB

Performed RSV-F experiments: KMC, RR, LFM, JW

Performed Di-iron experiments: JLW, DJ

Performed EF-hand experiments: AS, JLW

Performed PD-L1 experiments: WY, DRH, JW, SL, DJ

Contributed reagents and technical expertise: TS, JHC, LFM, NB, BIMW, BC, AM, FD

Wrote the manuscript: JW, DJ, JLW, SL, DT, SO, DB

[*] To whom correspondence should be addressed. dabaker@uw.edu, so@fas.harvard.edu

Article

Manuscript ID: NIHMS1830590

DOI: 10.1126/science.abn2100

PMC ID: 9621694

PubMed ID: 35862514

SO-VID: e995fb42-59a7-4d67-b6f4-754c6aa50b4c

License:

This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
