      Is Open Access

      Multifaceted protein–protein interaction prediction based on Siamese residual RCNN

      research-article


          Abstract

          Motivation

          Sequence-based protein–protein interaction (PPI) prediction represents a fundamental computational biology problem. To address this problem, extensive research efforts have been made to extract predefined features from the sequences. Based on these features, statistical algorithms are learned to classify the PPIs. However, such explicit features are usually costly to extract, and typically have limited coverage on the PPI information.

          Results

          We present an end-to-end framework, PIPR (Protein–Protein Interaction Prediction Based on Siamese Residual RCNN), for PPI prediction using only the protein sequences. PIPR incorporates a deep residual recurrent convolutional neural network in a Siamese architecture, leveraging both robust local features and contextualized information, which are significant for capturing the mutual influence of protein sequences. PIPR relieves the data pre-processing efforts that are required by other systems, and generalizes well to different application scenarios. Experimental evaluations show that PIPR outperforms various state-of-the-art systems on the binary PPI prediction problem. Moreover, it shows promising performance on the more challenging problems of interaction type prediction and binding affinity estimation, where existing approaches fall short.
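
To make the Siamese idea concrete, here is a minimal pure-Python sketch. It is not the authors' implementation (that lives in the repository listed under Availability and implementation). Several things are assumptions made for illustration only: the helper names (`encode`, `interaction_score`, `make_params`), the hidden size of 8, the random untrained weights, the average pooling, and combining the two encodings by element-wise product. The sketch shows one encoder, with a convolution followed by a recurrent pass and a residual connection, applied with the same weights to both sequences.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues
HIDDEN = 8   # illustrative hidden size (assumption)
WIDTH = 3    # illustrative convolution window (assumption)


def one_hot(seq):
    """Encode a sequence as a list of 20-dimensional one-hot vectors."""
    return [[1.0 if aa == ch else 0.0 for aa in AMINO_ACIDS] for ch in seq]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def make_params(seed=0):
    """Random, untrained weights standing in for learned parameters."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "conv": mat(HIDDEN, WIDTH * len(AMINO_ACIDS)),
        "w_in": mat(HIDDEN, HIDDEN),
        "w_rec": mat(HIDDEN, HIDDEN),
        "out": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],
    }


def conv1d(xs, w):
    """Valid 1D convolution: flatten each window, project it, apply ReLU."""
    out = []
    for i in range(len(xs) - WIDTH + 1):
        window = [v for x in xs[i:i + WIDTH] for v in x]
        out.append([max(0.0, h) for h in matvec(w, window)])
    return out


def recurrent_residual(hs, w_in, w_rec):
    """Plain tanh recurrence over the conv features plus a skip connection."""
    state = [0.0] * HIDDEN
    out = []
    for h in hs:
        pre = [a + b for a, b in zip(matvec(w_in, h), matvec(w_rec, state))]
        state = [math.tanh(p) for p in pre]
        out.append([s + x for s, x in zip(state, h)])  # residual: add input back
    return out


def encode(seq, params):
    """Shared encoder: conv -> residual recurrence -> average pooling."""
    hs = conv1d(one_hot(seq), params["conv"])
    hs = recurrent_residual(hs, params["w_in"], params["w_rec"])
    return [sum(col) / len(hs) for col in zip(*hs)]


def interaction_score(seq_a, seq_b, params):
    """Siamese scoring: the SAME encoder and weights process both sequences."""
    ea, eb = encode(seq_a, params), encode(seq_b, params)
    joint = [x * y for x, y in zip(ea, eb)]  # element-wise product (assumption)
    logit = sum(w * j for w, j in zip(params["out"], joint))
    return 1.0 / (1.0 + math.exp(-logit))


params = make_params(seed=0)
score = interaction_score("MKTAYIAKQR", "GAVLIPFMW", params)
```

Because both sequences pass through the same weights and are merged by a commutative operation, the score does not depend on the order of the pair, which is a natural property for an interaction predictor. With untrained weights the score itself carries no biological meaning. Sequences shorter than the convolution window are not handled in this sketch.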

          Availability and implementation

          The implementation is available at https://github.com/muhaochen/seq_ppi.git.

          Supplementary information

          Supplementary data are available at Bioinformatics online.


                Author and article information

                Journal
                Bioinformatics
                Oxford University Press
                ISSN: 1367-4803, 1367-4811
                July 2019 (05 July 2019)
                Volume: 35
                Issue: 14
                Pages: i305-i314
                Affiliations
                [1] Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
                [2] Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, USA
                Author notes
                To whom correspondence should be addressed: muhaochen@ucla.edu

                The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

                Article
                Publisher ID: btz328
                DOI: 10.1093/bioinformatics/btz328
                PMCID: PMC6681469
                PMID: 31510705
                © The Author(s) 2019. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                History
                Page count
                Pages: 10
                Funding
                Funded by: National Institutes of Health 10.13039/100000002
                Award ID: R01GM115833
                Award ID: U54 GM114833
                Funded by: National Science Foundation 10.13039/100000001
                Award ID: DBI-1565137
                Award ID: DGE-1829071
                Categories
                ISMB/ECCB 2019 Conference Proceedings
                Macromolecular Sequence, Structure, and Function

                Bioinformatics & Computational biology