      Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning


          Abstract

          Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.
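          The 'mutation map' mentioned in the abstract can be illustrated with a minimal sketch: score a sequence, substitute each base in turn, and record how the score changes. The sketch below is not DeepBind's model. It assumes a hypothetical 4-position log-odds matrix (`TOY_PWM`, with made-up values) as a stand-in scorer, and the function names `score` and `mutation_map` are illustrative only.

```python
from typing import Dict, List, Tuple

BASES = "ACGT"

# Hypothetical 4-position log-odds matrix favouring the motif "ACGT".
# The values are illustrative and do not come from any experiment.
TOY_PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def score(seq: str, pwm: List[Dict[str, float]] = TOY_PWM) -> float:
    """Return the best matrix match score over all windows of the sequence."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the matrix")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def mutation_map(
    seq: str, pwm: List[Dict[str, float]] = TOY_PWM
) -> Dict[Tuple[int, str], float]:
    """Map (position, substituted base) to the change in score it causes."""
    baseline = score(seq, pwm)
    deltas: Dict[Tuple[int, str], float] = {}
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt == ref:
                continue  # only true substitutions are of interest
            mutant = seq[:pos] + alt + seq[pos + 1:]
            deltas[(pos, alt)] = score(mutant, pwm) - baseline
    return deltas


if __name__ == "__main__":
    for (pos, alt), delta in sorted(mutation_map("TTACGTTT").items()):
        print(pos, alt, delta)
```

          In this toy setting, substitutions inside the matched motif lower the score while substitutions in the flanks leave it unchanged, which is the kind of per-position effect a mutation map is meant to display. A trained model would replace the `score` function.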

          Most cited references (25)

          Design and analysis of ChIP-seq experiments for DNA-binding proteins

          Recent progress in massively parallel sequencing platforms has allowed for genome-wide measurements of DNA-associated proteins using a combination of chromatin immunoprecipitation and sequencing (ChIP-seq). While a variety of methods exist for analysis of the established microarray alternative (ChIP-chip), few approaches have been described for processing ChIP-seq data. To fill this gap, we propose an analysis pipeline specifically designed to detect protein binding positions with high accuracy. Using three separate datasets, we illustrate new methods for improving tag alignment and correcting for background signals. We also compare sensitivity and spatial precision of several novel and previously described binding detection algorithms. Finally, we analyze the relationship between the depth of sequencing and characteristics of the detected binding positions, and provide a method for estimating the sequencing depth necessary for a desired coverage of protein binding sites.

            Origins of specificity in protein-DNA recognition.

            Specific interactions between proteins and DNA are fundamental to many biological processes. In this review, we provide a revised view of protein-DNA interactions that emphasizes the importance of the three-dimensional structures of both macromolecules. We divide protein-DNA interactions into two categories: those when the protein recognizes the unique chemical signatures of the DNA bases (base readout) and those when the protein recognizes a sequence-dependent DNA shape (shape readout). We further divide base readout into those interactions that occur in the major groove from those that occur in the minor groove. Analogously, the readout of the DNA shape is subdivided into global shape recognition (for example, when the DNA helix exhibits an overall bend) and local shape recognition (for example, when a base pair step is kinked or a region of the minor groove is narrow). Based on the >1500 structures of protein-DNA complexes now available in the Protein Data Bank, we argue that individual DNA-binding proteins combine multiple readout mechanisms to achieve DNA-binding specificity. Specificity that distinguishes between families frequently involves base readout in the major groove, whereas shape readout is often exploited for higher resolution specificity, to distinguish between members within the same DNA-binding protein family.

              Genome-wide analysis of PTB-RNA interactions reveals a strategy used by the general splicing repressor to modulate exon inclusion or skipping.

              Recent transcriptome analysis indicates that >90% of human genes undergo alternative splicing, underscoring the contribution of differential RNA processing to diverse proteomes in higher eukaryotic cells. The polypyrimidine tract-binding protein PTB is a well-characterized splicing repressor, but PTB knockdown causes both exon inclusion and skipping. Genome-wide mapping of PTB-RNA interactions and construction of a functional RNA map now reveal that dominant PTB binding near a competing constitutive splice site generally induces exon inclusion, whereas prevalent binding close to an alternative site often causes exon skipping. This positional effect was further demonstrated by disrupting or creating a PTB-binding site on minigene constructs and testing their responses to PTB knockdown or overexpression. These findings suggest a mechanism for PTB to modulate splice site competition to produce opposite functional consequences, which may be generally applicable to RNA-binding splicing factors to positively or negatively regulate alternative splicing in mammalian cells.

                Author and article information

                Journal: Nature Biotechnology (Nat Biotechnol)
                Publisher: Springer Science and Business Media LLC
                ISSN: 1087-0156 (print), 1546-1696 (electronic)
                Dates: July 27 2015; August 2015
                Volume: 33
                Issue: 8
                Pages: 831-838
                Type: Article
                DOI: 10.1038/nbt.3300
                PMID: 26213851
                Record ID: a9b6e6d3-d994-4f9f-9455-902910fef76b
                © 2015

                http://www.springer.com/tdm
