
      The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets

      research-article


          Abstract

          Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein–protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online, at https://string-db.org/.
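The abstract mentions querying STRING with user data and a new scoring mode for physical interactions. As a rough illustration only, the sketch below builds request URLs for STRING's public REST interface without contacting the server. The `/api` path, the method names `network` and `enrichment`, and the parameters `identifiers`, `species` and `network_type` are assumptions drawn from the public STRING API documentation, not from this article, and should be checked against the current documentation. The helper name `build_string_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL from the abstract; the "/api" suffix is an assumption
# taken from the public STRING API documentation.
STRING_API_BASE = "https://string-db.org/api"


def build_string_url(method, identifiers, species, output_format="tsv", **extra):
    """Build a STRING API request URL (hypothetical helper, no network access).

    identifiers: list of protein or gene names
    species:     NCBI taxonomy identifier (9606 is human)
    extra:       further query parameters, passed through unchanged
    """
    params = {
        # Multiple identifiers are joined with a carriage return,
        # which urlencode percent-encodes as %0D.
        "identifiers": "\r".join(identifiers),
        "species": species,
        **extra,
    }
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


# Network restricted to physical interactions (assumed parameter value).
network_url = build_string_url(
    "network", ["TP53", "MDM2"], 9606, network_type="physical"
)

# Functional enrichment for the same query set.
enrichment_url = build_string_url("enrichment", ["TP53", "MDM2"], 9606)

print(network_url)
print(enrichment_url)
```

The resulting URLs can be passed to any HTTP client. Building them separately from the request keeps the example runnable offline and makes the assumed parameters easy to inspect.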


                Author and article information

                Contributors
                Journal
                Nucleic Acids Res
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                08 January 2021
                25 November 2020
                25 November 2020
                Volume: 49
                Issue: D1
                Pages: D605-D612
                Affiliations
                Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
                Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
                TurkuNLP Group, Department of Future Technologies, University of Turku, 20014 Turun Yliopisto, Finland
                Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
                Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany
                Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
                Department of Bioinformatics, Biozentrum, University of Würzburg, 97074 Würzburg, Germany
                Author notes
                To whom correspondence should be addressed. Tel: +41 44 6353147; Fax: +41 44 6356864; Email: mering@imls.uzh.ch

                The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

                Author information
                http://orcid.org/0000-0002-8965-0848
                http://orcid.org/0000-0002-8806-6850
                http://orcid.org/0000-0001-7734-9102
                Article
                Article ID: gkaa1074
                DOI: 10.1093/nar/gkaa1074
                PMCID: PMC7779004
                PMID: 33237311
                61c3fc17-1e9d-4c01-896a-fe9f1ca5bf20
                © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Accepted: 23 November 2020
                Revised: 20 October 2020
                Received: 14 September 2020
                Page count
                Pages: 8
                Funding
                Funded by: Novo Nordisk Foundation, DOI 10.13039/501100009708;
                Award ID: NNF14CC0001
                Funded by: Academy of Finland, DOI 10.13039/501100002341;
                Award ID: 332844
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics