      Accurate prediction of protein–nucleic acid complexes using RoseTTAFoldNA

      research-article

          Abstract

          Protein–RNA and protein–DNA complexes play critical roles in biology. Despite considerable recent advances in protein structure prediction, the prediction of the structures of protein–nucleic acid complexes without homology to known complexes is a largely unsolved problem. Here we extend the RoseTTAFold machine learning protein-structure-prediction approach to additionally predict nucleic acid and protein–nucleic acid complexes. We develop a single trained network, RoseTTAFoldNA, that rapidly produces three-dimensional structure models with confidence estimates for protein–DNA and protein–RNA complexes. Here we show that confident predictions have considerably higher accuracy than current state-of-the-art methods. RoseTTAFoldNA should be broadly useful for modeling the structure of naturally occurring protein–nucleic acid complexes, and for designing sequence-specific RNA and DNA-binding proteins.

          Abstract

          RoseTTAFoldNA extends the RoseTTAFold2 platform to predict the structures of protein–DNA and protein–RNA complexes.

                Author and article information

                Contributors
                dimaio@uw.edu
                Journal
                Nature Methods (Nat Methods)
                Nature Publishing Group US (New York)
                ISSN: 1548-7091, 1548-7105
                Published: 23 November 2023
                2024; 21(1): 117-121
                Affiliations
                [1] School of Biological Sciences, Seoul National University (https://ror.org/04h9pn542), Seoul, Republic of Korea
                [2] Department of Biochemistry, University of Washington (https://ror.org/00cvxb145), Seattle, WA, USA
                [3] Institute for Protein Design, University of Washington (https://ror.org/00cvxb145), Seattle, WA, USA
                [4] Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA (GRID grid.47840.3f; ISNI 0000 0001 2181 7878)
                [5] Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA (GRID grid.34477.33; ISNI 0000 0001 2298 6657)
                Author information
                http://orcid.org/0000-0003-3414-9404
                http://orcid.org/0000-0003-0291-2196
                http://orcid.org/0000-0003-3645-2044
                http://orcid.org/0000-0001-7896-6217
                http://orcid.org/0000-0002-7524-8938
                Article
                Article number: 2086
                DOI: 10.1038/s41592-023-02086-5
                PMCID: PMC10776382
                PMID: 37996753
                193a67c6-3643-40dd-9d56-f4c5151a2a9a
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 12 January 2023
                Accepted: 16 October 2023
                Funding
                Funded by: U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS) (FundRef https://doi.org/10.13039/100000057)
                Award ID: GM123089
                Categories
                Article
                Custom metadata
                © Springer Nature America, Inc. 2024

                Life sciences
                machine learning, DNA-binding proteins, RNA-binding proteins
