      Is Open Access

      For long-term sustainable software in bioinformatics

      discussion
      Luis Pedro Coelho 1, 2, *
      PLOS Computational Biology
      Public Library of Science


          Most cited references (29)


          Highly accurate protein structure prediction with AlphaFold

          Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6, 7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the ‘protein folding problem’ [8]) has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.

            Basic local alignment search tool.

            A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score. Recent mathematical results on the stochastic properties of MSP scores allow an analysis of the performance of this method as well as the statistical significance of alignments it generates. The basic algorithm is simple and robust; it can be implemented in a number of ways and applied in a variety of contexts including straightforward DNA and protein sequence database searches, motif searches, gene identification searches, and in the analysis of multiple regions of similarity in long DNA sequences. In addition to its flexibility and tractability to mathematical analysis, BLAST is an order of magnitude faster than existing sequence comparison tools of comparable sensitivity.

              Improved protein structure prediction using potentials from deep learning

              Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence [1]. This problem is of fundamental importance as the structure of a protein largely determines its function [2]; however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information. It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures [3]. Here we show that we can train a neural network to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, we construct a potential of mean force [4] that can accurately describe the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures. The resulting system, named AlphaFold, achieves high accuracy, even for sequences with fewer homologous sequences. In the recent Critical Assessment of Protein Structure Prediction [5] (CASP13), a blind assessment of the state of the field, AlphaFold created high-accuracy structures (with template modelling (TM) scores [6] of 0.7 or higher) for 24 out of 43 free modelling domains, whereas the next best method, which used sampling and contact information, achieved such accuracy for only 14 out of 43 domains. AlphaFold represents a considerable advance in protein-structure prediction. We expect this increased accuracy to enable insights into the function and malfunction of proteins, especially in cases for which no structures for homologous proteins have been experimentally determined [7].

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLOS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 15 March 2024
                Volume 20, Issue 3, e1011920
                Affiliations
                [1] Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, Queensland, Australia
                [2] Centre for Data Science, Queensland University of Technology, Brisbane, Australia
                University of Virginia, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0002-9280-7885
                Article
                Manuscript ID: PCOMPBIOL-D-23-01655
                DOI: 10.1371/journal.pcbi.1011920
                PMCID: PMC10942072
                PMID: 38489255
                1ca98737-98c7-4646-b673-a79dd7a519b0
                © 2024 Luis Pedro Coelho

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Figures: 0, Tables: 0, Pages: 7
                Funding
                Funded by: Australian Research Council (funder ID: http://dx.doi.org/10.13039/501100000923)
                Award ID: FT230100724
                This work was supported by the Australian Research Council (grant FT230100724). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Perspective
                Computer and Information Sciences > Software Engineering > Computer Software
                Engineering and Technology > Software Engineering > Computer Software
                Research and Analysis Methods > Research Assessment > Reproducibility
                Computer and Information Sciences > Software Engineering > Computer Software > Open Source Software
                Engineering and Technology > Software Engineering > Computer Software > Open Source Software
                Science Policy > Open Science > Open Source Software
                Computer and Information Sciences > Software Engineering > Software Tools
                Engineering and Technology > Software Engineering > Software Tools
                Research and Analysis Methods > Database and Informatics Methods > Bioinformatics
                Biology and Life Sciences > Computational Biology
                People and Places > Population Groupings > Educational Status > Trainees
                Computer and Information Sciences > Software Engineering > Software Development
                Engineering and Technology > Software Engineering > Software Development

                Quantitative & Systems biology
