      Is Open Access

      RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning

      research-article
      Nucleic Acids Research
      Oxford University Press


          Abstract

          The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a ‘living data resource.’ Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.

          Graphical Abstract


          RCSB.org now delivers ∼200 000 experimentally-determined PDB structures alongside >1M Computed Structure Models that can all be searched, analyzed, visualized, and explored using custom tools and features.



                Author and article information

                Contributors
                Journal
                Nucleic Acids Res
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                06 January 2023
                24 November 2022
                24 November 2022
                Volume: 51
                Issue: D1
                Pages: D488-D508
                Affiliations
                Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
                Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
                Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
                Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
                Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
                School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, NY 14623, USA
                Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
                Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
                Author notes
                To whom correspondence should be addressed. Tel: +1 848 445 0103; Email: Stephen.Burley@RCSB.org

                Deceased.

                Author information
                https://orcid.org/0000-0002-2487-9713
                https://orcid.org/0000-0003-3576-0387
                https://orcid.org/0000-0003-0512-1031
                https://orcid.org/0000-0002-9544-5621
                https://orcid.org/0000-0002-1788-6579
                https://orcid.org/0000-0001-6121-9442
                https://orcid.org/0000-0002-0574-2041
                https://orcid.org/0000-0002-7905-6327
                https://orcid.org/0000-0001-9544-8390
                https://orcid.org/0000-0002-2694-7003
                https://orcid.org/0000-0001-6817-7476
                https://orcid.org/0000-0003-3103-7781
                https://orcid.org/0000-0002-6686-5475
                https://orcid.org/0000-0001-8896-6878
                https://orcid.org/0000-0002-4149-1745
                Article
                Article ID: gkac1077
                DOI: 10.1093/nar/gkac1077
                PMCID: PMC9825554
                PMID: 36420884
                © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Accepted: 02 November 2022
                Revised: 17 October 2022
                Received: 30 September 2022
                Page count
                Pages: 21
                Funding
                Funded by: National Science Foundation, DOI 10.13039/100000001;
                Award ID: DBI-1832184
                Funded by: U.S. Department of Energy, DOI 10.13039/100000015;
                Award ID: DE-SC0019749
                Funded by: National Cancer Institute, DOI 10.13039/100000054;
                Funded by: National Institute of Allergy and Infectious Diseases, DOI 10.13039/100000060;
                Funded by: National Institutes of Health, DOI 10.13039/100000002;
                Award ID: R01GM133198
                Funded by: UK Biotechnology and Biological Research Council;
                Award ID: DBI-2019297
                Funded by: NSF, DOI 10.13039/100000001;
                Award ID: DBI-1756248
                Award ID: DBI-2112966
                Funded by: NIH-NIGMS;
                Award ID: R01GM083960
                Award ID: P41GM109824
                Categories
                Database Issue

                Genetics
