      Is Open Access

      When will RNA get its AlphaFold moment?

      research-article

          Abstract

          The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that there are challenges preventing the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the challenges are the limited number of structures and alignments making data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, as they are often of insufficient quality, highly biased and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but it will require solving several data quality and volume issues, usage of data beyond simple sequence alignments, or the development of new less data-hungry machine learning methods.

          Graphical Abstract


                Author and article information

                Contributors
                Journal
                Title: Nucleic Acids Research (Nucleic Acids Res)
                Publisher: Oxford University Press
                ISSN: 0305-1048, 1362-4962
                Publication dates: 13 October 2023; 13 September 2023
                Volume: 51
                Issue: 18
                Pages: 9522-9532
                Affiliations
                Institute of Biotechnology of the Czech Academy of Sciences, Prumyslova 595, CZ-252 50 Vestec, Czech Republic
                European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
                Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
                Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
                Author notes
                To whom correspondence should be addressed. Tel: +48 616652999; Fax: +44 1223494100; Email: agb@ebi.ac.uk
                Author information
                https://orcid.org/0000-0001-7855-3690
                https://orcid.org/0000-0002-6497-2883
                https://orcid.org/0000-0002-6982-4660
                https://orcid.org/0000-0002-1969-9304
                https://orcid.org/0000-0003-4103-9238
                https://orcid.org/0000-0002-8724-7908
                Article
                Article ID: gkad726
                DOI: 10.1093/nar/gkad726
                PMC ID: 10570031
                PubMed ID: 37702120
                © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Accepted: 22 August 2023
                Revised: 13 August 2023
                Received: 10 May 2023
                Page count
                Pages: 11
                Funding
                Funded by: National Science Centre Poland;
                Award ID: 2019/35/B/ST6/03074
                Funded by: European Molecular Biology Laboratory, DOI 10.13039/100013060;
                Funded by: Politechnika Poznańska, DOI 10.13039/501100004239;
                Funded by: ELIXIR CZ;
                Award ID: LM2023055
                Funded by: Akademie Věd České Republiky, DOI 10.13039/501100004240;
                Award ID: RVO 86652036
                Categories
                AcademicSubjects/SCI00010
                Critical Reviews and Perspectives

                Genetics