      ColabFold: making protein folding accessible to all

          Abstract

          ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40–60-fold faster search and optimized model utilization enable prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.
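
          To put the quoted throughput in perspective, the following back-of-the-envelope sketch converts "close to 1,000 structures per day" into an average wall-clock time per structure. It is not a benchmark from the article: the helper name `seconds_per_structure` is hypothetical, and the calculation assumes predictions run back to back on the single-GPU server for a full 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in one day


def seconds_per_structure(structures_per_day: float) -> float:
    """Average wall-clock seconds per prediction at a given daily throughput.

    Assumption (not from the article): the server runs predictions
    continuously, one after another, for the whole day.
    """
    if structures_per_day <= 0:
        raise ValueError("structures_per_day must be positive")
    return SECONDS_PER_DAY / structures_per_day


if __name__ == "__main__":
    # Throughput figure quoted in the abstract: close to 1,000 per day.
    print(f"{seconds_per_structure(1000):.1f} s per structure")
```

          Under these assumptions, 1,000 structures per day corresponds to roughly a minute and a half per structure, homology search included.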

          Abstract

          ColabFold is a free and accessible platform for protein folding that provides accelerated prediction of protein structures and complexes using AlphaFold2 or RoseTTAFold.

                Author and article information

                Contributors
                milot.mirdita@mpinat.mpg.de
                so@fas.harvard.edu
                martin.steinegger@snu.ac.kr
                Journal
                Nature Methods (Nat Methods)
                Nature Publishing Group US (New York)
                ISSN: 1548-7091, 1548-7105
                Published: 30 May 2022
                Volume 19, issue 6, pages 679-682
                Affiliations
                [1] Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany (GRID grid.4372.2; ISNI 0000 0001 2105 1091)
                [2] School of Biological Sciences, Seoul National University, Seoul, South Korea (GRID grid.31501.36; ISNI 0000 0004 0470 5905)
                [3] Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan (GRID grid.26999.3d; ISNI 0000 0001 2151 536X)
                [4] Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo, Japan (GRID grid.26999.3d; ISNI 0000 0001 2151 536X)
                [5] Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA (GRID grid.17088.36; ISNI 0000 0001 2150 1785)
                [6] JHDSF Program, Harvard University, Cambridge, MA, USA (GRID grid.38142.3c; ISNI 000000041936754X)
                [7] FAS Division of Science, Harvard University, Cambridge, MA, USA (GRID grid.38142.3c; ISNI 000000041936754X)
                [8] Artificial Intelligence Institute, Seoul National University, Seoul, South Korea (GRID grid.31501.36; ISNI 0000 0004 0470 5905)
                [9] Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea (GRID grid.31501.36; ISNI 0000 0004 0470 5905)
                Author information
                http://orcid.org/0000-0001-8637-6719
                http://orcid.org/0000-0002-3957-412X
                http://orcid.org/0000-0003-0448-9790
                http://orcid.org/0000-0002-3153-2363
                http://orcid.org/0000-0003-2774-2744
                http://orcid.org/0000-0001-8781-9753
                Article
                1488
                DOI: 10.1038/s41592-022-01488-1
                PMCID: PMC9184281
                PMID: 35637307
                64813b2f-8bcf-4570-9ebe-f50faec12cf2
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                : 29 October 2021
                : 11 April 2022
                Funding
                Funded by: FundRef https://doi.org/10.13039/501100002347, Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research);
                Award ID: horizontal4meta
                Funded by: FundRef https://doi.org/10.13039/100009619, Japan Agency for Medical Research and Development (AMED);
                Award ID: JP21am0101107
                Funded by: FundRef https://doi.org/10.13039/100000001, National Science Foundation (NSF);
                Award ID: MCB2032259
                Funded by: NIH Grants DP5OD026389 and R21AI156595; Moore–Simons 735929LPI
                Funded by: FundRef https://doi.org/10.13039/501100003725, National Research Foundation of Korea (NRF);
                Award ID: 2019R1A6A1A10073437, 2020M3A9G7103933, 2021R1C1C102065, 2021M3A9I4021220
                Funded by: FundRef https://doi.org/10.13039/501100002551, Seoul National University;
                Award ID: New Faculty Startup Fund
                Award ID: Creative-Pioneering Researchers Program
                Categories
                Brief Communication
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature America, Inc. 2022

                Life sciences
                protein structure predictions, computational models, software, protein databases
