BIGSdb: Scalable analysis of bacterial genome variation at the population level.

Abstract

The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner.

Most cited references 44

The Bioperl toolkit: Perl modules for the life sciences.

Jason E Stajich, David Block, Kris Boulez … (2002)

The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit, the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort.

eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data.

E Feil, B. C. Li, D M Aanensen … (2004)

The introduction of multilocus sequence typing (MLST) for the precise characterization of isolates of bacterial pathogens has had a marked impact on both routine epidemiological surveillance and microbial population biology. In both fields, a key prerequisite for exploiting this resource is the ability to discern the relatedness and patterns of evolutionary descent among isolates with similar genotypes. Traditional clustering techniques, such as dendrograms, provide a very poor representation of recent evolutionary events, as they attempt to reconstruct relationships in the absence of a realistic model of the way in which bacterial clones emerge and diversify to form clonal complexes. An increasingly popular approach, called BURST, has been used as an alternative, but present implementations are unable to cope with very large data sets and offer crude graphical outputs. Here we present a new implementation of this algorithm, eBURST, which divides an MLST data set of any size into groups of related isolates and clonal complexes, predicts the founding (ancestral) genotype of each clonal complex, and computes the bootstrap support for the assignment. The most parsimonious patterns of descent of all isolates in each clonal complex from the predicted founder(s) are then displayed. The advantages of eBURST for exploring patterns of evolutionary descent are demonstrated with a number of examples, including the simple Spain(23F)-1 clonal complex of Streptococcus pneumoniae, "population snapshots" of the entire S. pneumoniae and Staphylococcus aureus MLST databases, and the more complicated clonal complexes observed for Campylobacter jejuni and Neisseria meningitidis.
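
The grouping and founder-prediction steps described in this abstract can be illustrated with a minimal sketch. It is not the eBURST implementation. It assumes, since the abstract does not state either rule, that two genotypes are linked when their allelic profiles differ at exactly one locus (a single-locus variant), that a group is a connected set of such links, and that the predicted founder is the genotype with the most single-locus variants in its group. Bootstrap support is omitted, and the sequence type numbers and profiles are invented for illustration.

```python
def is_single_locus_variant(profile_a, profile_b):
    """True when two allelic profiles differ at exactly one locus."""
    return sum(a != b for a, b in zip(profile_a, profile_b)) == 1


def slv_groups(profiles):
    """Group sequence types (STs) linked by single-locus-variant steps.

    Assumed grouping rule for illustration only. Returns the list of groups
    and the SLV neighbour map used to build them.
    """
    sts = sorted(profiles)
    neighbours = {st: set() for st in sts}
    for i, first in enumerate(sts):
        for second in sts[i + 1:]:
            if is_single_locus_variant(profiles[first], profiles[second]):
                neighbours[first].add(second)
                neighbours[second].add(first)

    groups, seen = [], set()
    for st in sts:
        if st in seen:
            continue
        # Depth-first walk over SLV links collects one connected group.
        group, stack = set(), [st]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups, neighbours


def predicted_founder(group, neighbours):
    """Pick the ST with the most SLVs; ties go to the lowest ST number."""
    return max(sorted(group), key=lambda st: len(neighbours[st]))


# Hypothetical seven-locus profiles keyed by ST number.
profiles = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (2, 1, 1, 1, 1, 1, 1),
    3: (1, 3, 1, 1, 1, 1, 1),
    4: (1, 1, 1, 1, 1, 1, 4),
    5: (9, 9, 9, 9, 9, 9, 9),
}
groups, neighbours = slv_groups(profiles)
founder = predicted_founder(groups[0], neighbours)
```

In this toy data set, STs 1 to 4 form one group centred on ST 1, and ST 5 is a singleton.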

Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms.

M. C. J. Maiden, J. Bygraves, E Feil … (1998)

Traditional and molecular typing schemes for the characterization of pathogenic microorganisms are poorly portable because they index variation that is difficult to compare among laboratories. To overcome these problems, we propose multilocus sequence typing (MLST), which exploits the unambiguous nature and electronic portability of nucleotide sequence data for the characterization of microorganisms. To evaluate MLST, we determined the sequences of approximately 470-bp fragments from 11 housekeeping genes in a reference set of 107 isolates of Neisseria meningitidis from invasive disease and healthy carriers. For each locus, alleles were assigned arbitrary numbers and dendrograms were constructed from the pairwise differences in multilocus allelic profiles by cluster analysis. The strain associations obtained were consistent with clonal groupings previously determined by multilocus enzyme electrophoresis. A subset of six gene fragments was chosen that retained the resolution and congruence achieved by using all 11 loci. Most isolates from hyper-virulent lineages of serogroups A, B, and C meningococci were identical for all loci or differed from the majority type at only a single locus. MLST using six loci therefore reliably identified the major meningococcal lineages associated with invasive disease. MLST can be applied to almost all bacterial species and other haploid organisms, including those that are difficult to cultivate. The overwhelming advantage of MLST over other molecular typing methods is that sequence data are truly portable between laboratories, permitting one expanding global database per species to be placed on a World-Wide Web site, thus enabling exchange of molecular typing data for global epidemiology via the Internet.
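
The pairwise comparison of allelic profiles mentioned in this abstract can be sketched as follows. This is an illustrative example, not the authors' code: it assumes the distance between two isolates is simply the number of loci at which their allele numbers differ, and the isolate names and six-locus profiles are invented. The cluster analysis used to build dendrograms from these distances is not shown.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Count the loci at which two allelic profiles carry different alleles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def pairwise_distances(profiles):
    """Return {(isolate_x, isolate_y): distance} for every pair of isolates."""
    return {
        (x, y): allelic_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


# Hypothetical isolates: each tuple holds the arbitrary allele number
# assigned at each of six loci.
profiles = {
    "isolate_1": (1, 3, 1, 1, 1, 1),
    "isolate_2": (1, 3, 4, 1, 1, 1),
    "isolate_3": (2, 5, 4, 7, 1, 9),
}
distances = pairwise_distances(profiles)
```

Because allele numbers are arbitrary labels, only equality between them is compared; the size of the numeric difference carries no meaning.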

Author and article information

Journal

Journal ID (iso-abbrev): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: Springer Science and Business Media LLC

ISSN (Electronic): 1471-2105

ISSN (Print): 1471-2105

Publication date (Electronic): Dec 10 2010

Volume: 11

Affiliations

[1] Department of Zoology, University of Oxford, UK. keith.jolley@zoo.ox.ac.uk

Article

Publisher Item ID: 1471-2105-11-595

DOI: 10.1186/1471-2105-11-595

PMC ID: 3004885

PubMed ID: 21143983

SO-VID: a5f9643f-d6ae-40cf-abb7-68c7cb03fb8c