
      High-Resolution Identification of Specificity Determining Positions in the LacI Protein Family Using Ensembles of Sub-Sampled Alignments

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          Since the advent of large-scale genomic sequencing, and the consequent availability of large numbers of homologous protein sequences, there has been burgeoning development of methods for extracting functional information from multiple sequence alignments (MSAs). One type of analysis seeks to identify specificity determining positions (SDPs) based on the assumption that such positions are highly conserved within groups of sequences sharing functional specificity, but conserved to different amino acids in different specificity groups. This unsupervised approach to utilizing evolutionary information may elucidate mechanisms of specificity in protein-protein interactions, catalytic activity of enzymes, sensitivity to allosteric regulation, and other types of protein functionality. We present an analysis of SDPs in the LacI family of transcriptional regulators in which we 1) relax the constraint that all specificity groups must contribute to SDP signal, and 2) use a novel approach to robust treatment of sequence alignment uncertainty based on sub-sampling. We find that the vast majority of SDP signal occurs at positions with a conservation pattern that significantly complicates detection by previously described methods. This pattern, which we term “partial SDP”, consists of the commonly accepted SDP conservation pattern among a subset of specificity groups and strong degeneracy among the rest. An upshot of this fact is that the SDP complement of every specificity group appears to be unique. Additionally, sub-sampling gives us the ability to assign a confidence interval to the SDP score, as well as increase fidelity, as compared to analysis of a single, comprehensive alignment—the current standard in multiple sequence alignment methodologies.
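The conservation patterns described in the abstract can be illustrated with a small sketch. This is not the authors' scoring method, which the abstract does not specify. It is a toy example under stated assumptions: a group counts as "conserved" when its most common residue reaches a hypothetical threshold (`min_freq=0.8`), the score is simply the fraction of group pairs conserved to different residues, and sub-sampling draws residues from one fixed alignment column rather than rebuilding alignments from sub-sampled sequences. All function names are invented for illustration.

```python
import random
from collections import Counter


def consensus(residues):
    """Return (most common residue, its frequency) for one group's residues."""
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


def conserved_groups(column, min_freq=0.8):
    """Map each group conserved at >= min_freq (assumed threshold) to its consensus."""
    result = {}
    for group, residues in column.items():
        residue, freq = consensus(residues)
        if freq >= min_freq:
            result[group] = residue
    return result


def contributing_groups(column, min_freq=0.8):
    """Conserved groups whose consensus differs from another conserved group's."""
    cons = conserved_groups(column, min_freq)
    return sorted(
        g for g, aa in cons.items()
        if any(aa != other for h, other in cons.items() if h != g)
    )


def classify(column, min_freq=0.8):
    """Label a column as full SDP, partial SDP, or no SDP signal."""
    n = len(contributing_groups(column, min_freq))
    if n == 0:
        return "no SDP signal"
    return "full SDP" if n == len(column) else "partial SDP"


def pair_score(column, min_freq=0.8):
    """Toy score: fraction of group pairs conserved to different residues.

    Requires at least two groups in the column.
    """
    cons = conserved_groups(column, min_freq)
    names = sorted(column)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    hits = sum(1 for a, b in pairs if a in cons and b in cons and cons[a] != cons[b])
    return hits / len(pairs)


def subsample_interval(column, k, n_draws=200, lo=0.025, hi=0.975, seed=0):
    """Percentile interval of the toy score over draws of k residues per group."""
    rng = random.Random(seed)
    scores = sorted(
        pair_score({g: rng.sample(list(r), k) for g, r in column.items()})
        for _ in range(n_draws)
    )
    last = len(scores) - 1
    return scores[int(lo * last)], scores[int(hi * last)]


if __name__ == "__main__":
    # Made-up columns: each string holds one group's residues at a single position.
    full = {"A": "LLLLLLLL", "B": "KKKKKKKK", "C": "DDDDDDDD"}
    partial = {"A": "LLLLLLLL", "B": "KKKKKKKK", "C": "ADEGSTNQ"}
    for name, col in (("full", full), ("partial", partial)):
        print(name, classify(col), contributing_groups(col),
              subsample_interval(col, k=5))
```

In the `partial` column only groups A and B carry signal while group C is degenerate. A method that requires every group to contribute would down-weight such a position, which is the complication the abstract describes. The percentile interval stands in for the confidence interval that sub-sampling makes possible.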


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 28 September 2016
                Volume: 11
                Issue: 9
                Article number: e0162579
                Affiliations
                [1] Biomedical Engineering Department, Washington University in St. Louis, St. Louis, Missouri, 63130, United States of America
                [2] Center for Biological Systems Engineering, Washington University in St. Louis, St. Louis, Missouri, 63130, United States of America
                Editor affiliation: University of Edinburgh, United Kingdom
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                • Conceptualization: RS KMN.
                • Data curation: RS.
                • Formal analysis: RS.
                • Funding acquisition: KMN.
                • Investigation: RS.
                • Methodology: RS.
                • Software: RS.
                • Supervision: KMN.
                • Validation: RS.
                • Visualization: RS.
                • Writing – original draft: RS.
                • Writing – review & editing: RS KMN.

                Author information
                http://orcid.org/0000-0001-7146-9592
                Article
                Manuscript ID: PONE-D-16-29199
                DOI: 10.1371/journal.pone.0162579
                PMCID: PMC5040260
                PMID: 27681038
                © 2016 Sloutsky, Naegle

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 22 July 2016
                Accepted: 8 August 2016
                Page count
                Figures: 8, Tables: 0, Pages: 21
                Funding
                Computations were performed in part using the facilities of the Washington University Center for High Performance Computing, which were partially funded by NIH grants (1S10RR022984-01A1 and 1S10OD018091-01). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Research and Analysis Methods > Computational Techniques > Split-Decomposition Method > Multiple Alignment Calculation
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis
                Biology and Life Sciences > Molecular Biology > Macromolecular Structure Analysis > Protein Structure > Protein Structure Comparison
                Biology and Life Sciences > Biochemistry > Proteins > Protein Structure > Protein Structure Comparison
                Biology and Life Sciences > Biochemistry > Enzymology > Enzyme Chemistry > Enzyme Regulation > Allosteric Regulation
                Biology and Life Sciences > Biochemistry > Proteins > Allosteric Regulation
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Protein Sequencing
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Protein Sequencing
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Amino Acid Sequence Analysis
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Amino Acid Sequence Analysis
                Engineering and Technology > Signal Processing > Signal Filtering
                Data availability
                The code and the input sequence data for results presented in the paper have been uploaded to Figshare at DOI 10.6084/m9.figshare.3792930. The code is also available from http://naegle.wustl.edu/software, which will be updated with improved versions of the code as development continues.
