UniRef: comprehensive and non-redundant UniProt reference clusters.

Abstract

Redundant protein sequences in biological databases hinder sequence similarity searches and make interpretation of search results difficult. Clustering of protein sequence space based on sequence similarity helps organize all sequences into manageable datasets and reduces sampling bias and overrepresentation of sequences.
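
To make the idea of similarity-based clustering concrete, the following is a minimal sketch of greedy incremental clustering. It is not the procedure used to build UniRef. The function names, the example sequences, the 0.9 threshold, and the use of Python's `difflib` ratio as a stand-in for alignment-based sequence identity are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Toy similarity score in [0, 1]; a stand-in for real alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Group sequences under the first representative they match at >= threshold."""
    clusters: Dict[str, List[str]] = {}
    # Process longest sequences first so each representative is the longest member;
    # ties are broken by name to keep the result deterministic.
    ordered = sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        for rep in clusters:
            if identity(seqs[rep], seq) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical input: two identical sequences, one near-duplicate, one unrelated.
example = {
    "seqA": "MKTAYIAKQRQISFVKSHFSRQ",
    "seqB": "MKTAYIAKQRQISFVKSHFSRQ",  # exact duplicate of seqA
    "seqC": "MKTAYIAKQRQISFVKSHFSRA",  # one substitution relative to seqA
    "seqD": "GSSGSSGWWPLLNNDDEE",      # unrelated sequence
}
clusters = greedy_cluster(example, threshold=0.9)
print(clusters)
```

In this toy input the three near-identical sequences collapse into a single cluster with one representative, while the unrelated sequence stays on its own, which is the kind of reduction in overrepresentation the abstract describes.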

Author and article information

Journal

Journal ID (iso-abbrev): Bioinformatics

Title: Bioinformatics (Oxford, England)

Publisher: Oxford University Press (OUP)

ISSN (Electronic): 1367-4811

ISSN (Print): 1367-4803

Publication date (Electronic): May 15 2007

Volume: 23

Issue: 10

Affiliations

[1] Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, USA. bes23@georgetown.edu

Article

Publisher Item ID: btm098

DOI: 10.1093/bioinformatics/btm098

PubMed ID: 17379688

SO-VID: 5767b4e9-3fc6-4c72-98c9-4ab76f4bb001
