      Is Open Access

      Integration of text mining and biological network analysis: Identification of essential genes in sulfate-reducing bacteria

      research-article


          Abstract

          The growth and survival of an organism in a particular environment depend heavily on certain indispensable genes, termed essential genes. Sulfate-reducing bacteria (SRB) are obligate anaerobes that rely on sulfate reduction to meet their energy requirements. The present study used Oleidesulfovibrio alaskensis G20 (OA G20) as a model SRB to categorize essential genes according to their key metabolic pathways. Herein, we report a feedback-loop framework for gene-of-interest discovery, from bio-problem to gene set of interest, that combines expert annotation with computational prediction. A defined bio-problem was used to retrieve SRB genes from literature databases (PubMed and PubMed Central), and the retrieved genes were annotated against the genome of OA G20. The retrieved gene list was then used for protein–protein interaction enrichment and corroborated by pangenome analysis to categorize the enriched gene sets and their respective pathways as essential or non-essential. Interestingly, the sat gene (dde_2265) from sulfur metabolism was the bridging gene between all the enriched pathways. Gene clusters involved in essential pathways were linked with genes from seleno-compound metabolism, amino acid metabolism, secondary metabolite synthesis, and cofactor biosynthesis. Furthermore, pangenome analysis showed the gene distribution: 69.83% of the 116 enriched genes were classified as “persistent,” implying that these genes are essential. Likewise, 21.55% of the enriched genes, notably the formate dehydrogenases and metallic hydrogenases, fell under “shell.” Our methodology suggests that semi-automated text mining and network analysis may play a crucial role in deciphering previously unexplored genes and key mechanisms, which can help establish a baseline before any experimental studies are performed.
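
The percentages in the abstract can be read back as gene counts. The sketch below is only an arithmetic check, not part of the authors' pipeline: it assumes the percentages are shares of the 116 enriched genes, rounded to two decimals, and the names `TOTAL_ENRICHED`, `reported`, and `count_from_percent` are hypothetical helpers introduced for illustration. The abstract does not say how the remaining genes were classified, so they are reported here only as a remainder.

```python
# Arithmetic check on the figures quoted in the abstract (not the authors' code).
TOTAL_ENRICHED = 116  # number of enriched genes, from the abstract

# Percentages quoted in the abstract for two pangenome categories.
reported = {"persistent": 69.83, "shell": 21.55}


def count_from_percent(percent: float, total: int) -> int:
    """Return the whole-number count closest to `percent` percent of `total`."""
    return round(percent / 100 * total)


counts = {label: count_from_percent(p, TOTAL_ENRICHED) for label, p in reported.items()}

# Genes not covered by the two categories named in the abstract.
remainder = TOTAL_ENRICHED - sum(counts.values())

for label, n in counts.items():
    print(f"{label}: {n} genes ({n / TOTAL_ENRICHED:.2%})")
print(f"not described in the abstract: {remainder} genes ({remainder / TOTAL_ENRICHED:.2%})")
```

Under these assumptions, the two quoted percentages correspond to whole-number gene counts, which is a quick consistency check on the reported figures.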

          Most cited references (112)

          Cytoscape: a software environment for integrated models of biomolecular interaction networks.

          Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

            Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

            S Altschul (1997)
            The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
              Is Open Access

              The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets

              Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein–protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online, at https://string-db.org/.

                Author and article information

                Journal
                Frontiers in Microbiology (Front. Microbiol.)
                Frontiers Media S.A.
                ISSN: 1664-302X
                Published: 13 April 2023
                Volume: 14
                Article: 1086021
                Affiliations
                [1] Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States
                [2] Data Driven Material Discovery Center for Bioengineering Innovation, South Dakota School of Mines and Technology, Rapid City, SD, United States
                [3] 2-Dimensional Materials for Biofilm Engineering, Science and Technology, South Dakota School of Mines and Technology, Rapid City, SD, United States
                [4] Department of Biomedical Engineering, University of South Dakota, Sioux Falls, SD, United States
                [5] BuG ReMeDEE Consortium, South Dakota School of Mines and Technology, Rapid City, SD, United States
                Author notes

                Edited by: George Tsiamis, University of Patras, Greece

                Reviewed by: Hongwei Liu, Sun Yat-sen University, Zhuhai Campus, China; Yuejun Wang, University of California, San Francisco, United States

                *Correspondence: Etienne Z. Gnimpieba, etienne.gnimpieba@usd.edu

                This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology

                Article
                DOI: 10.3389/fmicb.2023.1086021
                PMCID: PMC10133479
                PMID: 37125195
                2fbd5780-08d9-48a6-94f6-afdfd78cafa5
                Copyright © 2023 Saxena, Rauniyar, Thakur, Singh, Bomgni, Alaba, Tripathi, Gnimpieba, Lushbough and Sani.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 31 October 2022
                Accepted: 23 March 2023
                Page count
                Figures: 8, Tables: 3, Equations: 0, References: 117, Pages: 17, Words: 13581
                Funding
                Funded by: National Science Foundation, doi 10.13039/501100008982;
                Award ID: #1736255
                Award ID: #1849206
                Award ID: #1920954
                Categories
                Microbiology
                Original Research

                Microbiology & Virology
                Keywords: Oleidesulfovibrio alaskensis G20, essential genes, pathways, semi-automated model, SRB, text mining
