      Is Open Access

      Accuracy of Protein-Protein Binding Sites in High-Throughput Template-Based Modeling

      research-article
      PLoS Computational Biology
      Public Library of Science


          Abstract

          The accuracy of protein structures, particularly their binding sites, is essential for the success of modeling protein complexes. Computationally inexpensive methodology is required for genome-wide modeling of such structures. For systematic evaluation of potential accuracy in high-throughput modeling of binding sites, a statistical analysis of target-template sequence alignments was performed for a representative set of protein complexes. For most of the complexes, alignments containing all residues of the interface were found. The full interface alignments were obtained even in the case of poor alignments where a relatively small part of the target sequence (as low as 40%) aligned to the template sequence, with a low overall alignment identity (<30%). Although such poor overall alignments might be considered inadequate for modeling of whole proteins, the alignment of the interfaces was strong enough for docking. In the set of homology models built on these alignments, one third of those ranked 1 by a simple sequence identity criterion had RMSD<5 Å, the accuracy suitable for low-resolution template-free docking. Such models corresponded to multi-domain target proteins, whereas for single-domain proteins the best models had 5 Å<RMSD<10 Å, the accuracy suitable for less sensitive structure-alignment methods. Overall, ∼50% of complexes with the interfaces modeled by high-throughput techniques had accuracy suitable for meaningful docking experiments. This percentage will grow with the increasing availability of co-crystallized protein-protein complexes.
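
The two quantities discussed in the abstract, the share of interface residues covered by a target-template alignment and the RMSD ranges tied to different docking approaches, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the residue numbering, the example values, the handling of an RMSD of exactly 5 Å, and the label used for RMSD of 10 Å or more are all assumptions made for illustration.

```python
from typing import Set


def interface_coverage(aligned: Set[int], interface: Set[int]) -> float:
    """Fraction of interface residues that fall inside the aligned region.

    Both arguments are sets of target residue indices (hypothetical numbering).
    """
    if not interface:
        raise ValueError("interface must contain at least one residue")
    return len(aligned & interface) / len(interface)


def docking_suitability(rmsd: float) -> str:
    """Map a model RMSD (in angstroms) to the ranges named in the abstract.

    The abstract uses strict inequalities; placing exactly 5.0 in the second
    range and labeling >= 10.0 as outside both ranges are assumptions.
    """
    if rmsd < 0:
        raise ValueError("RMSD cannot be negative")
    if rmsd < 5.0:
        return "low-resolution template-free docking"
    if rmsd < 10.0:
        return "structure-alignment methods"
    return "outside the ranges discussed"


# Hypothetical 100-residue target where only residues 30..69 align (40%),
# yet every interface residue lies inside the aligned region.
target_length = 100
aligned_residues = set(range(30, 70))
interface_residues = {35, 40, 41, 52, 60}

aligned_fraction = len(aligned_residues) / target_length
coverage = interface_coverage(aligned_residues, interface_residues)
```

In this made-up case the alignment covers 40% of the target while the interface coverage is complete, which is the situation the abstract describes for poor overall alignments that still align the full interface.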

          Author Summary

          Protein-protein interactions play a central role in life processes at the molecular level. Structural information on these interactions is essential for understanding these processes and for designing drugs to cure diseases. Limitations of experimental techniques for determining the structure of protein-protein complexes leave the vast majority of these complexes to be determined by computational modeling. Modeling is also important for revealing the mechanisms of complex formation. The 3D modeling of protein complexes (protein docking) relies on the structures of the individual proteins to predict their assembly. Thus the structural accuracy of the individual proteins, which are often models themselves, is critical for docking. For docking purposes, the accuracy of the binding sites is essential, whereas the accuracy of the non-binding regions is less critical. In our study, we systematically analyze the accuracy of the binding sites in protein models produced by high-throughput techniques suitable for large-scale (e.g., genome-wide) studies. The results indicate that this accuracy is adequate for low- to medium-resolution docking of a significant part of known protein-protein complexes.


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 1 April 2010
                Volume 6, issue 4, e1000727
                Affiliations
                [1]Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
                National Cancer Institute, United States of America and Tel Aviv University, Israel
                Author notes

                Conceived and designed the experiments: PJK IAV. Performed the experiments: PJK. Analyzed the data: PJK IAV. Wrote the paper: PJK.

                Article
                Manuscript ID: 09-PLCB-RA-0593R4
                DOI: 10.1371/journal.pcbi.1000727
                PMC ID: 2848539
                PMID: 20369011
                6ddc07ee-22d0-472d-9378-745c45d1a8a0
                Kundrotas, Vakser. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 26 May 2009
                Accepted: 1 March 2010
                Page count
                Pages: 10
                Categories
                Research Article
                Biophysics/Biomacromolecule-Ligand Interactions
                Biophysics/Macromolecular Assemblies and Machines
                Biophysics/Structural Genomics
                Computational Biology/Genomics
                Computational Biology/Macromolecular Structure Analysis

                Quantitative & Systems biology
