
      Eukaryotic CD-NTase, STING, and viperin proteins evolved via domain shuffling, horizontal transfer, and ancient inheritance from prokaryotes

      research-article
      PLOS Biology
      Public Library of Science


          Abstract

          Animals use a variety of cell-autonomous innate immune proteins to detect viral infections and prevent replication. Recent studies have discovered that a subset of mammalian antiviral proteins have homology to antiphage defense proteins in bacteria, implying that there are aspects of innate immunity that are shared across the Tree of Life. While the majority of these studies have focused on characterizing the diversity and biochemical functions of the bacterial proteins, the evolutionary relationships between animal and bacterial proteins are less clear. This ambiguity is partly due to the long evolutionary distances separating animal and bacterial proteins, which obscure their relationships. Here, we tackle this problem for 3 innate immune families (CD-NTases [including cGAS], STINGs, and viperins) by deeply sampling protein diversity across eukaryotes. We find that viperins and OAS family CD-NTases are ancient immune proteins, likely inherited since the earliest eukaryotes first arose. In contrast, we find other immune proteins that were acquired via at least 4 independent events of horizontal gene transfer (HGT) from bacteria. Two of these events allowed algae to acquire new bacterial viperins, while 2 more HGT events gave rise to distinct superfamilies of eukaryotic CD-NTases: the cGLR superfamily (containing cGAS) that has since diversified via a series of animal-specific duplications and a previously undefined eSMODS superfamily, which more closely resembles bacterial CD-NTases. Finally, we found that cGAS and STING proteins have substantially different histories, with STING protein domains undergoing convergent domain shuffling in bacteria and eukaryotes. Overall, our findings paint a picture of eukaryotic innate immunity as highly dynamic, where eukaryotes build upon their ancient antiviral repertoires through the reuse of protein domains and by repeatedly sampling a rich reservoir of bacterial antiphage genes.

          Abstract

          How and when did our innate immune systems first evolve? This study analyses diverse eukaryotes, uncovering the evolutionary origins of three families of antiviral proteins: CD-NTases (including cGAS), STINGs, and viperins. Each reveals a different story connecting animal and bacterial immunity.


                Author and article information

                Contributors
                Roles: Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Conceptualization, Funding acquisition, Investigation, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing
                Role: Academic Editor
                Journal
                PLoS Biol
                PLOS Biology
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1544-9173 (print), 1545-7885 (electronic)
                8 December 2023
                December 2023
                Volume: 21
                Issue: 12
                Article: e3002436
                Affiliations
                [001] University of Pittsburgh, Department of Biological Sciences, Pittsburgh, Pennsylvania, United States of America
                HHMI, Massachusetts Institute of Technology, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0001-7883-8522
                Article
                Manuscript ID: PBIOLOGY-D-23-02924
                DOI: 10.1371/journal.pbio.3002436
                PMCID: PMC10732462
                PMID: 38064485
                4b159e39-8fba-423d-a80b-1e85159b4658
                © 2023 Culbertson, Levin

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                7 November 2023
                20 November 2023
                Page count
                Figures: 5, Tables: 0, Pages: 26
                Funding
                Funded by: University of Pittsburgh Center for Research Computing
                Award ID: RRID:SCR_022735
                Funded by: Office of Research Infrastructure Programs, National Institutes of Health (http://dx.doi.org/10.13039/100016958)
                Award ID: S10OD028483
                Funded by: Directorate for Biological Sciences, National Science Foundation
                Award ID: 2208971
                Funded by: Division of Intramural Research, National Institute of Allergy and Infectious Diseases (http://dx.doi.org/10.13039/100006492)
                Award ID: R00AI139344
                Funded by: National Institute of General Medical Sciences (http://dx.doi.org/10.13039/100000057)
                Award ID: R35GM150681
                This research was supported in part by the University of Pittsburgh Center for Research Computing, RRID:SCR_022735, through the resources provided. Specifically, this work used the HTC cluster, which is supported by NIH award number S10OD028483. EMC was supported by NSF Postdoctoral fellowship 2208971 and TCL was supported by NIH R00AI139344 and R35GM150681. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Organisms > Eukaryota
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Biology and Life Sciences > Taxonomy > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Computer and Information Sciences > Data Management > Taxonomy > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Biology and Life Sciences > Biochemistry > Proteins > Protein Domains
                Physical sciences > Mathematics > Probability theory > Markov models > Hidden Markov models
                Research and Analysis Methods > Database and Informatics Methods > Bioinformatics > Sequence Analysis > Sequence Alignment
                Biology and Life Sciences > Immunology > Immune System Proteins
                Medicine and Health Sciences > Immunology > Immune System Proteins
                Biology and Life Sciences > Biochemistry > Proteins > Immune System Proteins
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Processes > Horizontal Gene Transfer
                Biology and Life Sciences > Genetics > Gene Transfer > Horizontal Gene Transfer
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Immunology
                Custom metadata
                vor-update-to-uncorrected-proof
                2023-12-20
                All relevant data are within the paper and its Supporting Information files. Additional code used in the paper is available at https://github.com/MBL-Physiology-Bioinformatics/2021-Bioinformatics-Tutorial-Materials/tree/master/phylogenetics

                Life sciences