36
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      A Probabilistic Model of RNA Conformational Space

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The increasing importance of non-coding RNA in biology and medicine has led to a growing interest in the problem of RNA 3-D structure prediction. As is the case for proteins, RNA 3-D structure prediction methods require two key ingredients: an accurate energy function and a conformational sampling procedure. Both are only partly solved problems. Here, we focus on the problem of conformational sampling. The current state of the art solution is based on fragment assembly methods, which construct plausible conformations by stringing together short fragments obtained from experimental structures. However, the discrete nature of the fragments necessitates the use of carefully tuned, unphysical energy functions, and their non-probabilistic nature impairs unbiased sampling. We offer a solution to the sampling problem that removes these important limitations: a probabilistic model of RNA structure that allows efficient sampling of RNA conformations in continuous space, and with associated probabilities. We show that the model captures several key features of RNA structure, such as its rotameric nature and the distribution of the helix lengths. Furthermore, the model readily generates native-like 3-D conformations for 9 out of 10 test structures, solely using coarse-grained base-pairing information. In conclusion, the method provides a theoretical and practical solution for a major bottleneck on the way to routine prediction and simulation of RNA structure and dynamics in atomic detail.

          Author Summary

          The importance of RNA in biology and medicine has increased immensely over the last several years, due to the discovery of a wide range of important biological processes that are under the guidance of non-coding RNA. As is the case with proteins, the function of an RNA molecule is encoded in its three-dimensional (3-D) structure, which in turn is determined by the molecule's sequence. Therefore, interest in the computational prediction of the 3-D structure of RNA from sequence is great. One of the main bottlenecks in routine prediction and simulation of RNA structure and dynamics is sampling, the efficient generation of RNA-like conformations, ideally in a mathematically and physically sound way. Current methods require the use of unphysical energy functions to amend the shortcomings of the sampling procedure. We have developed a mathematical model that describes RNA's conformational space in atomic detail, without the shortcomings of other sampling methods. As an illustration of its potential, we describe a simple yet efficient method to sample conformations that are compatible with a given secondary structure. An implementation of the sampling method, called BARNACLE, is freely available.

          Related collections

          Most cited references20

          • Record: found
          • Abstract: found
          • Article: not found

          Non-coding RNA genes and the modern RNA world.

          S. Eddy (2001)
          Non-coding RNA (ncRNA) genes produce functional RNA molecules rather than encoding proteins. However, almost all means of gene identification assume that genes encode proteins, so even in the era of complete genome sequences, ncRNA genes have been effectively invisible. Recently, several different systematic screens have identified a surprisingly large number of new ncRNA genes. Non-coding RNAs seem to be particularly abundant in roles that require highly specific nucleic acid recognition without complex catalysis, such as in directing post-transcriptional regulation of gene expression or in guiding RNA modifications.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions.

            We explore the ability of a simple simulated annealing procedure to assemble native-like structures from fragments of unrelated protein structures with similar local sequences using Bayesian scoring functions. Environment and residue pair specific contributions to the scoring functions appear as the first two terms in a series expansion for the residue probability distributions in the protein database; the decoupling of the distance and environment dependencies of the distributions resolves the major problems with current database-derived scoring functions noted by Thomas and Dill. The simulated annealing procedure rapidly and frequently generates native-like structures for small helical proteins and better than random structures for small beta sheet containing proteins. Most of the simulated structures have native-like solvent accessibility and secondary structure patterns, and thus ensembles of these structures provide a particularly challenging set of decoys for evaluating scoring functions. We investigate the effects of multiple sequence information and different types of conformational constraints on the overall performance of the method, and the ability of a variety of recently developed scoring functions to recognize the native-like conformations in the ensembles of simulated structures.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data.

              The classical RNA secondary structure model considers A.U and G.C Watson-Crick as well as G.U wobble base pairs. Here we substitute it for a new one, in which sets of nucleotide cyclic motifs define RNA structures. This model allows us to unify all base pairing energetic contributions in an effective scoring function to tackle the problem of RNA folding. We show how pipelining two computer algorithms based on nucleotide cyclic motifs, MC-Fold and MC-Sym, reproduces a series of experimentally determined RNA three-dimensional structures from the sequence. This demonstrates how crucial the consideration of all base-pairing interactions is in filling the gap between sequence and structure. We use the pipeline to define rules of precursor microRNA folding in double helices, despite the presence of a number of presumed mismatches and bulges, and to propose a new model of the human immunodeficiency virus-1 -1 frame-shifting element.
                Bookmark

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Comput Biol
                plos
                ploscomp
                PLoS Computational Biology
                Public Library of Science (San Francisco, USA )
                1553-734X
                1553-7358
                June 2009
                June 2009
                19 June 2009
                : 5
                : 6
                : e1000406
                Affiliations
                [1 ]The Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
                [2 ]Department of Statistics, University of Leeds, Leeds, United Kingdom
                [3 ]DTU Elektro, Technical University of Denmark, Lyngby, Denmark
                Wellcome Trust Sanger Institute, United Kingdom
                Author notes

                Conceived and designed the experiments: JF IM MT TH. Performed the experiments: JF IM MT. Analyzed the data: JF IM MT TH. Contributed reagents/materials/analysis tools: KVM JFB. Wrote the paper: JF IM TH. Important parts of the research for this article was conducted as part of a joint master thesis project to which MT, IM and JF contributed equally.

                Article
                09-PLCB-RA-0161R2
                10.1371/journal.pcbi.1000406
                2691987
                19543381
                02abcff4-d497-49ed-8169-36a630442431
                Frellsen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                : 17 February 2009
                : 6 May 2009
                Page count
                Pages: 11
                Categories
                Research Article
                Biophysics/RNA Structure
                Biophysics/Theory and Simulation
                Computational Biology/Macromolecular Structure Analysis
                Computational Biology/Molecular Dynamics
                Mathematics/Statistics
                Molecular Biology/Bioinformatics

                Quantitative & Systems biology
                Quantitative & Systems biology

                Comments

                Comment on this article