Abstract

Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have a multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only deal with single-location or “singleplex” proteins. However, multiple-location or “multiplex” proteins should not be ignored because they usually possess unique biological functions worthy of special notice. By introducing “multi-labeled learning” and an “accumulation-layer scale”, a new predictor, called iLoc-Euk, has been developed that can deal with systems containing both singleplex and multiplex proteins. As a demonstration, jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of the proteins included has pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that of any existing predictor that can also deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Euk.
It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, Systems Biology, and Drug Development. Its novel approach may also stimulate the development of methods for predicting other protein attributes.
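To make the evaluation idea in the abstract concrete, the following is a minimal sketch of a jackknife (leave-one-out) test on multi-label data. It is not the iLoc-Euk algorithm: the nearest-neighbour rule, the two-dimensional feature vectors, the toy proteins, and the label-recovery score are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (feature vector, set of location labels).
Protein = Tuple[List[float], Set[str]]


def squared_distance(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_nearest(query: List[float], training: List[Protein]) -> Set[str]:
    """Return a copy of the label set of the nearest training protein.

    A stand-in classifier (assumption); the real predictor is more elaborate.
    """
    nearest = min(training, key=lambda p: squared_distance(query, p[0]))
    return set(nearest[1])


def jackknife_label_recovery(dataset: List[Protein]) -> float:
    """Leave each protein out in turn and predict it from the rest.

    Returns the fraction of true (protein, location) labels that were
    recovered. Spurious extra labels are not penalised by this toy score.
    """
    hits = 0
    total = 0
    for i, (features, true_labels) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]  # everything except protein i
        predicted = predict_nearest(features, training)
        hits += len(predicted & true_labels)
        total += len(true_labels)
    return hits / total


# Made-up toy data: one multiplex protein (two locations), three singleplex.
TOY_DATA: List[Protein] = [
    ([0.0, 0.0], {"nucleus"}),
    ([0.1, 0.0], {"nucleus", "cytoplasm"}),
    ([1.0, 1.0], {"mitochondrion"}),
    ([1.1, 1.0], {"mitochondrion"}),
]

if __name__ == "__main__":
    print(jackknife_label_recovery(TOY_DATA))
```

In this toy run the multiplex protein is only partly recovered, because its nearest neighbour carries a single label. That is the kind of case a multi-label predictor has to handle and a single-label one cannot.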

Most cited references 78

Gene Ontology: tool for the unification of biology

Michael Ashburner, Catherine A. Ball, Judith Blake … (2002)

Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology.

Evelyn Camon, Michele Magrane, Daniel Barrell … (2004)

The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality electronic and manual annotations to the UniProt Knowledgebase (Swiss-Prot, TrEMBL and PIR-PSD) using the standardized vocabulary of the Gene Ontology (GO). As a supplementary archive of GO annotation, GOA promotes a high level of integration of the knowledge represented in UniProt with other databases. This is achieved by converting UniProt annotation into a recognized computational format. GOA provides annotated entries for nearly 60,000 species (GOA-SPTr) and is the largest and most comprehensive open-source contributor of annotations to the GO Consortium annotation effort. By integrating GO annotations from other model organism groups, GOA consolidates specialized knowledge and expertise to ensure the data remain a key reference for up-to-date biological information. Furthermore, the GOA database fully endorses the Human Proteomics Initiative by prioritizing the annotation of proteins likely to benefit human health and disease. In addition to a non-redundant set of annotations to the human proteome (GOA-Human) and monthly releases of its GO annotation for all species (GOA-SPTr), a series of GO mapping files and specific cross-references in other databases are also regularly distributed. GOA can be queried through a simple user-friendly web interface or downloaded in a parsable format via the EBI and GO FTP websites. The GOA data set can be used to enhance the annotation of particular model organism or gene expression data sets, although increasingly it has been used to evaluate GO predictions generated from text mining or protein interaction experiments. In 2004, the GOA team will build on its success and will continue to supplement the functional annotation of UniProt and work towards enhancing the ability of scientists to access all available biological information. 
Researchers wishing to query or contribute to the GOA project are encouraged to email: goa@ebi.ac.uk.

Predotar: A tool for rapidly screening proteomes for N-terminal targeting sequences.

Ian Small, Nemo Peeters, Fabrice Legeai … (2004)

Probably more than 25% of the proteins encoded by the nuclear genomes of multicellular eukaryotes are targeted to membrane-bound compartments by N-terminal targeting signals. The major signals are those for the endoplasmic reticulum, the mitochondria, and in plants, plastids. The most abundant of these targeted proteins are well-known and well-studied, but a large proportion remain unknown, including most of those involved in regulation of organellar gene expression or regulation of biochemical pathways. The discovery and characterization of these proteins by biochemical means will be long and difficult. An alternative method is to identify candidate organellar proteins via their characteristic N-terminal targeting sequences. We have developed a neural network-based approach (Predotar--Prediction of Organelle Targeting sequences) for identifying genes encoding these proteins amongst eukaryotic genome sequences. The power of this approach for identifying and annotating novel gene families has been illustrated by the discovery of the pentatricopeptide repeat family.

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS One

Journal ID (publisher-id): plos

Journal ID (pmc): plosone

Title: PLoS ONE

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Electronic): 1932-6203

Publication date Collection: 2011

Publication date (Electronic): 30 March 2011

Volume: 6

Issue: 3

Electronic Location Identifier: e18258

Affiliations

[1] Gordon Life Science Institute, San Diego, California, United States of America

[2] Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China

Kyushu Institute of Technology, Japan

Author notes

* E-mail: kcchou@gordonlifescience.org

Conceived and designed the experiments: KCC. Performed the experiments: ZCW XX. Analyzed the data: KCC ZCW XX. Contributed reagents/materials/analysis tools: ZCW XX. Wrote the paper: KCC.

Article

Publisher ID: PONE-D-11-00243

DOI: 10.1371/journal.pone.0018258

PMC ID: 3068162

PubMed ID: 21483473

SO-VID: 89ad3f19-eb01-4026-9cb9-50fa9109ded5

Copyright © Chou et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 17 December 2010

Date accepted: 24 February 2011

Page count

Pages: 10

iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins
