110
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          RegulonDB ( http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation,as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments.

          Related collections

          Most cited references31

          • Record: found
          • Abstract: found
          • Article: not found

          Network motifs in the transcriptional regulation network of Escherichia coli

          Little is known about the design principles of transcriptional regulation networks that control gene expression in cells. Recent advances in data collection and analysis, however, are generating unprecedented amounts of information about gene regulation networks. To understand these complex wiring diagrams, we sought to break down such networks into basic building blocks. We generalize the notion of motifs, widely used for sequence analysis, to the level of networks. We define 'network motifs' as patterns of interconnections that recur in many different parts of a network at frequencies much higher than those found in randomized networks. We applied new algorithms for systematically detecting network motifs to one of the best-characterized regulation networks, that of direct transcriptional interactions in Escherichia coli. We find that much of the network is composed of repeated appearances of three highly significant motifs. Each network motif has a specific function in determining gene expression, such as generating temporal expression programs and governing the responses to fluctuating external signals. The motif structure also allows an easily interpretable view of the entire known transcriptional network of the organism. This approach may help define the basic computational elements of other biological networks.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            CDD: a conserved domain database for interactive domain family analysis

            The conserved domain database (CDD) is part of NCBI's Entrez database system and serves as a primary resource for the annotation of conserved domain footprints on protein sequences in Entrez. Entrez's global query interface can be accessed at and will search CDD and many other databases. Domain annotation for proteins in Entrez has been pre-computed and is readily available in the form of ‘Conserved Domain’ links. Novel protein sequences can be scanned against CDD using the CD-Search service; this service searches databases of CDD-derived profile models with protein sequence queries using BLAST heuristics, at . Protein query sequences submitted to NCBI's protein BLAST search service are scanned for conserved domain signatures by default. The CDD collection contains models imported from Pfam, SMART and COG, as well as domain models curated at NCBI. NCBI curated models are organized into hierarchies of domains related by common descent. Here we report on the status of the curation effort and present a novel helper application, CDTree, which enables users of the CDD resource to examine curated hierarchies. More importantly, CDD and CDTree used in concert, serve as a powerful tool in protein classification, as they allow users to analyze protein sequences in the context of domain family hierarchies.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more

              This article summarizes our progress with RegulonDB (http://regulondb.ccg.unam.mx/) during the past 2 years. We have kept up-to-date the knowledge from the published literature regarding transcriptional regulation in Escherichia coli K-12. We have maintained and expanded our curation efforts to improve the breadth and quality of the encoded experimental knowledge, and we have implemented criteria for the quality of our computational predictions. Regulatory phrases now provide high-level descriptions of regulatory regions. We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated through high-throughput (HT) technology. Based on our analysis of most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. With this latest release of RegulonDB, we present a new highly reliable larger collection of transcription start sites, a result of our experimental HT genome-wide efforts. These improvements, together with several novel enhancements (the tracks display, uploading format and curational guidelines), address the challenges of incorporating HT-generated knowledge into RegulonDB. Information on the evolutionary conservation of regulatory elements is also available now. Altogether, RegulonDB version 8.0 is a much better home for integrating knowledge on gene regulation from the sources of information currently available.
                Bookmark

                Author and article information

                Journal
                Nucleic Acids Res
                Nucleic Acids Res
                nar
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                04 January 2016
                02 November 2015
                02 November 2015
                : 44
                : Database issue , Database issue
                : D133-D143
                Affiliations
                [1 ]Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, Mexico
                [2 ]UMR_S 1090 TAGC, INSERM, Marseille, 13000 France
                [3 ]Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Boulevard Juriquilla 3001, Juriquilla 76230, Santiago de Querétaro, QRO, Mexico
                [4 ]Departamento de Microbiologia Molecular, IBT, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62100, Mexico
                [5 ]Institute of Computational Linguistics, University of Zurich, Binzmühlestrasse 14, CH-8050 Zurich, Switzerland
                Author notes
                [* ]To whom correspondence should be addressed. Tel: +52-777-3132063; Fax: +52-777-317-5581; Email: collado@ 123456ccg.unam.mx
                []These authors contributed equally to the paper as first authors.
                Article
                10.1093/nar/gkv1156
                4702833
                26527724
                260b5644-9acb-4d56-82c6-f60ab8c2d147
                © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 19 October 2015
                : 17 October 2015
                : 15 September 2015
                Page count
                Pages: 11
                Categories
                Database Issue
                Custom metadata
                04 January 2016

                Genetics
                Genetics

                Comments

                Comment on this article