      Transfer learning via multi-scale convolutional neural layers for human–virus protein–protein interaction prediction

      research-article

          Abstract

          Motivation

          To complement experimental efforts, machine learning-based computational methods are playing an increasingly important role in predicting human–virus protein–protein interactions (PPIs). Furthermore, transfer learning can effectively apply prior knowledge obtained from a large source dataset/task to a small target dataset/task, improving prediction performance.

          Results

          To predict interactions between human and viral proteins, we combine evolutionary sequence profile features with a Siamese convolutional neural network (CNN) architecture and a multi-layer perceptron. Our architecture outperforms machine learning methods based on various feature encodings as well as state-of-the-art prediction methods. As our main contribution, we introduce two transfer learning methods (i.e. ‘frozen’ type and ‘fine-tuning’ type) that reliably predict interactions in a target human–virus domain based on training in a source human–virus domain, by retraining CNN layers. Finally, we use the ‘frozen’ type transfer learning approach to predict human–SARS-CoV-2 PPIs, and our analysis indicates that the predictions are topologically and functionally similar to experimentally known interactions.
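As a rough illustration of the architecture and the two transfer types described above, the following dependency-free Python sketch encodes two sequence profiles with one shared set of convolution filters of several widths, concatenates the pooled features, and scores the pair with a small perceptron. It is not the authors' implementation (that is in the TransPPI repository). The filter widths, layer sizes, random profiles, and all function names are hypothetical, and the mapping of 'frozen' to fixed convolution weights and 'fine-tuning' to updating all weights follows the conventional usage of these terms rather than details stated in the abstract.

```python
import math
import random


def conv_relu_maxpool(profile, kernel):
    """Slide one filter along the sequence, apply ReLU, then global max pooling.

    profile: list of L rows with C channels each (e.g. 20 profile columns).
    kernel:  list of k rows with C weights each (one filter of width k).
    """
    k = len(kernel)
    best = 0.0  # starting at 0 folds the ReLU into the running maximum
    for start in range(len(profile) - k + 1):
        act = sum(w * x
                  for k_row, p_row in zip(kernel, profile[start:start + k])
                  for w, x in zip(k_row, p_row))
        best = max(best, act)
    return best


def encode(profile, conv_params):
    """Multi-scale encoder: one pooled feature per filter width."""
    return [conv_relu_maxpool(profile, kernel) for kernel in conv_params]


def predict(human, virus, params):
    """Siamese forward pass: the same filters encode both proteins."""
    features = encode(human, params["conv"]) + encode(virus, params["conv"])
    hidden = [max(0.0, sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(params["mlp_w1"], params["mlp_b1"])]
    logit = sum(w * h for w, h in zip(params["mlp_w2"], hidden)) + params["mlp_b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def init_params(rng, channels=20, widths=(2, 3, 4), hidden=4):
    """Random weights; widths and sizes are illustrative assumptions."""
    conv = [[[rng.uniform(-0.1, 0.1) for _ in range(channels)]
             for _ in range(k)] for k in widths]
    n_feat = 2 * len(conv)  # pooled features from both proteins
    return {
        "conv": conv,
        "mlp_w1": [[rng.uniform(-0.1, 0.1) for _ in range(n_feat)]
                   for _ in range(hidden)],
        "mlp_b1": [0.0] * hidden,
        "mlp_w2": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
        "mlp_b2": 0.0,
    }


def trainable_groups(mode):
    """Parameter groups that would be updated on the target dataset."""
    mlp = ["mlp_w1", "mlp_b1", "mlp_w2", "mlp_b2"]
    if mode == "frozen":       # assumed: source-trained filters stay fixed
        return mlp
    if mode == "fine-tuning":  # assumed: filters start from source weights and are updated
        return ["conv"] + mlp
    raise ValueError(f"unknown transfer mode: {mode!r}")


rng = random.Random(0)
params = init_params(rng)  # stands in for weights learned on a source dataset
human = [[rng.random() for _ in range(20)] for _ in range(12)]
virus = [[rng.random() for _ in range(20)] for _ in range(9)]
score = predict(human, virus, params)
print(f"interaction score: {score:.3f}")
print("frozen updates:", trainable_groups("frozen"))
print("fine-tuning updates:", trainable_groups("fine-tuning"))
```

In a real training loop, only the groups returned by `trainable_groups` would receive gradient updates on the target human–virus dataset; the sketch omits the training step itself.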

          Availability and implementation

          The source code and datasets are available at https://github.com/XiaodiYangCAU/TransPPI/.

          Supplementary information

          Supplementary data are available at Bioinformatics online.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                17 July 2021
                : btab533
                Affiliations
                [1 ]State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University , Beijing 100193, China
                [2 ]State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University , Beijing 100193, China
                [3 ]Department of Computer Science, University of Miami , Miami, FL 33146, USA
                [4 ]Department of Biology, University of Miami , Miami, FL 33146, USA
                [5 ]Sylvester Comprehensive Cancer Center, University of Miami , Miami, FL 33136, USA
                Author notes
                To whom correspondence should be addressed. wuchtys@cs.miami.edu or zidingzhang@cau.edu.cn
                Author information
                https://orcid.org/0000-0002-3229-5865
                https://orcid.org/0000-0001-5631-3549
                https://orcid.org/0000-0001-7576-2198
                https://orcid.org/0000-0001-8916-6522
                https://orcid.org/0000-0002-9296-571X
                Article
                Publisher ID: btab533
                DOI: 10.1093/bioinformatics/btab533
                PMC ID: 8406877
                PubMed ID: 34273146
                ScienceOpen ID: cd1bc004-c7b1-4a96-b76f-3b72e046d9f6
                © The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

                This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model ( https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

                This article is made available via the PMC Open Access Subset for unrestricted re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the COVID-19 pandemic or until permissions are revoked in writing. Upon expiration of these permissions, PMC is granted a perpetual license to make this article available via PMC and Europe PMC, consistent with existing copyright protections.

                History
                : 23 February 2021
                : 03 June 2021
                : 12 July 2021
                : 16 July 2021
                Page count
                Pages: 8
                Funding
                Funded by: National Key Research and Development Program of China, DOI 10.13039/501100012166;
                Award ID: 2017YFC1200205
                Award ID: 2017YFD0500404
                Categories
                Original Paper
                AcademicSubjects/SCI01060
                Custom metadata
                corrected-proof
                PAP

                Bioinformatics & Computational biology
