Accuracy of Protein-Protein Binding Sites in High-Throughput Template-Based Modeling

Abstract

The accuracy of protein structures, particularly their binding sites, is essential for the success of modeling protein complexes. Computationally inexpensive methodology is required for genome-wide modeling of such structures. For systematic evaluation of potential accuracy in high-throughput modeling of binding sites, a statistical analysis of target-template sequence alignments was performed for a representative set of protein complexes. For most of the complexes, alignments containing all residues of the interface were found. Full interface alignments were obtained even for poor alignments in which a relatively small part of the target sequence (as low as 40%) aligned to the template sequence, with a low overall alignment identity (<30%). Although such poor overall alignments might be considered inadequate for modeling of whole proteins, the alignment of the interfaces was strong enough for docking. In the set of homology models built on these alignments, one third of those ranked 1 by a simple sequence identity criterion had RMSD < 5 Å, an accuracy suitable for low-resolution template-free docking. Such models corresponded to multi-domain target proteins, whereas for single-domain proteins the best models had 5 Å < RMSD < 10 Å, an accuracy suitable for less sensitive structure-alignment methods. Overall, ∼50% of complexes with the interfaces modeled by high-throughput techniques had accuracy suitable for meaningful docking experiments. This percentage will grow with the increasing availability of co-crystallized protein-protein complexes.
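To make the three alignment quantities in the abstract concrete (the fraction of the target sequence aligned, the overall alignment identity, and the fraction of interface residues covered), the following is a minimal sketch and not the authors' pipeline. The function name `alignment_stats`, the gapped-string input format, and the choice to compute identity over aligned residue pairs are all assumptions made for illustration; the paper may define these measures differently.

```python
def alignment_stats(target_aln, template_aln, interface_positions):
    """Summarize a pairwise target-template alignment (illustrative sketch).

    target_aln, template_aln: equal-length gapped strings, '-' marks a gap.
    interface_positions: 0-based indices into the UNGAPPED target sequence
    (hypothetical input; how interface residues are defined is not shown here).
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned strings must have equal length")

    interface = set(interface_positions)
    target_len = sum(ch != "-" for ch in target_aln)
    aligned = identical = covered = 0
    pos = -1  # index of the current residue in the ungapped target

    for t, p in zip(target_aln, template_aln):
        if t == "-":
            continue          # gap in target: no target residue here
        pos += 1
        if p == "-":
            continue          # target residue has no template counterpart
        aligned += 1
        if t == p:
            identical += 1
        if pos in interface:
            covered += 1

    return {
        # fraction of target residues aligned to a template residue
        "coverage": aligned / target_len if target_len else 0.0,
        # identity over aligned pairs (an assumed definition)
        "identity": identical / aligned if aligned else 0.0,
        # fraction of interface residues that fall in the aligned region
        "interface_coverage": covered / len(interface) if interface else 0.0,
    }


stats = alignment_stats("ACDEFGHIKL", "--DEYGH---", [3, 4, 5])
print(stats)
```

In this toy case only half of the target is aligned, yet every interface residue is covered, which is the situation the abstract describes for poor overall alignments that remain usable for docking.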

Author Summary

Protein-protein interactions play a central role in life processes at the molecular level. Structural information on these interactions is essential for our understanding of these processes and our ability to design drugs to cure diseases. Limitations of experimental techniques for determining the structure of protein-protein complexes leave the vast majority of these complexes to be determined by computational modeling. The modeling is also important for revealing the mechanisms of complex formation. The 3D modeling of protein complexes (protein docking) relies on the structure of the individual proteins for the prediction of their assembly. Thus, the structural accuracy of the individual proteins, which often are models themselves, is critical for docking. For docking purposes, the accuracy of the binding sites is essential, whereas the accuracy of the non-binding regions is less critical. In our study, we systematically analyze the accuracy of the binding sites in protein models produced by high-throughput techniques suitable for large-scale (e.g., genome-wide) studies. The results indicate that this accuracy is adequate for the low- to medium-resolution docking of a significant part of known protein-protein complexes.
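As an illustration of how binding-site accuracy can be expressed as a single number, the sketch below computes the RMSD between matched atoms of a model and a reference structure and maps it onto the two accuracy ranges quoted in the abstract (below 5 Å and 5 to 10 Å). It assumes the two coordinate sets are already superimposed and listed in the same atom order; the function names and the label for models at 10 Å or above are hypothetical, and this is not the authors' evaluation code.

```python
import math


def rmsd(model_coords, reference_coords):
    """Root-mean-square deviation between two matched lists of (x, y, z).

    Assumes the structures are already superimposed and that atoms are
    listed in the same order in both lists (no fitting is done here).
    """
    if len(model_coords) != len(reference_coords) or not model_coords:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_r in zip(model_coords, reference_coords)
        for a, b in zip(atom_m, atom_r)
    )
    return math.sqrt(squared / len(model_coords))


def docking_category(rmsd_angstrom):
    """Map an RMSD (in Å) to the accuracy ranges named in the abstract."""
    if rmsd_angstrom < 5.0:
        return "low-resolution template-free docking"
    if rmsd_angstrom < 10.0:
        return "structure-alignment methods"
    # Label for this range is an assumption; the abstract does not name it.
    return "outside both ranges"


# Toy coordinates for two binding-site atoms (made-up values).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(3.0, 0.0, 0.0), (1.0, 4.0, 0.0)]
value = rmsd(model, reference)
print(round(value, 2), docking_category(value))
```

Restricting the coordinate lists to binding-site atoms gives a binding-site RMSD, while passing all atoms gives a whole-model RMSD; which of the two a given threshold refers to should be checked against the full article.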

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (publisher-id): plos

Journal ID (pmc): ploscomp

Title: PLoS Computational Biology

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date Collection: April 2010

Publication date (Print): April 2010

Publication date (Electronic): 1 April 2010

Volume: 6

Issue: 4

Electronic Location Identifier: e1000727

Affiliations

[1]Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America

National Cancer Institute, United States of America and Tel Aviv University, Israel

Author notes

* E-mail: vakser@ku.edu

Conceived and designed the experiments: PJK IAV. Performed the experiments: PJK. Analyzed the data: PJK IAV. Wrote the paper: PJK.

Article

Publisher ID: 09-PLCB-RA-0593R4

DOI: 10.1371/journal.pcbi.1000727

PMC ID: 2848539

PubMed ID: 20369011

SO-VID: 6ddc07ee-22d0-472d-9378-745c45d1a8a0

Copyright © Kundrotas, Vakser. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 26 May 2009

Date accepted: 1 March 2010

Page count

Pages: 10
