Computational methods to detect conserved non-genic elements in phylogenetically isolated genomes: application to zebrafish

Abstract

Many important model organisms for biomedical and evolutionary research have sequenced genomes, but occupy a phylogenetically isolated position, evolutionarily distant from other sequenced genomes. This phylogenetic isolation is exemplified by zebrafish, a vertebrate model for cis-regulation, development and human disease, whose evolutionary distance to all other currently sequenced fish exceeds the distance between human and chicken. Such large distances make it difficult to align genomes and use them for comparative analysis beyond gene-focused questions. In particular, detecting conserved non-genic elements (CNEs) as promising cis-regulatory elements with biological importance is challenging. Here, we develop a general comparative genomics framework to align isolated genomes and to comprehensively detect CNEs. Our approach integrates highly sensitive and quality-controlled local alignments and uses alignment transitivity and ancestral reconstruction to bridge large evolutionary distances. We apply our framework to zebrafish and demonstrate substantially improved CNE detection and quality compared with previous sets. Our zebrafish CNE set comprises 54 533 CNEs, of which 11 792 (22%) are conserved to human or mouse. Our zebrafish CNEs (http://zebrafish.stanford.edu) are highly enriched in known enhancers and extend existing experimental (ChIP-Seq) sets. The same framework can now be applied to the isolated genomes of frog, amphioxus, Caenorhabditis elegans and many others.
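The abstract names alignment transitivity without detailing it. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below chains ungapped alignment blocks from a query genome to a hypothetical "bridge" genome with blocks from that bridge genome to a target, which yields inferred query-to-target blocks. The `Block` type, the `compose` function, the forward-strand and ungapped simplifications, and all coordinates are assumptions made for this example.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical ungapped, forward-strand alignment block.

    Positions src_start .. src_start+length-1 in the source genome align
    one-to-one with dst_start .. dst_start+length-1 in the destination.
    """
    src_start: int
    dst_start: int
    length: int


def compose(a_to_b: List[Block], b_to_c: List[Block]) -> List[Block]:
    """Infer A->C blocks by intersecting A->B and B->C blocks on genome B."""
    inferred = []
    for ab in a_to_b:
        ab_lo, ab_hi = ab.dst_start, ab.dst_start + ab.length
        for bc in b_to_c:
            bc_lo, bc_hi = bc.src_start, bc.src_start + bc.length
            # Overlap of the two blocks in bridge-genome (B) coordinates.
            lo, hi = max(ab_lo, bc_lo), min(ab_hi, bc_hi)
            if lo < hi:
                inferred.append(Block(
                    src_start=ab.src_start + (lo - ab_lo),  # shift into A
                    dst_start=bc.dst_start + (lo - bc_lo),  # shift into C
                    length=hi - lo,
                ))
    return sorted(inferred)


# Made-up coordinates: query -> bridge and bridge -> target blocks.
query_to_bridge = [Block(100, 500, 50), Block(200, 700, 30)]
bridge_to_target = [Block(520, 9000, 40), Block(710, 9500, 10)]

inferred = compose(query_to_bridge, bridge_to_target)
for block in inferred:
    print(block)
```

Only the portion of the bridge genome covered by both alignments survives the composition, so the inferred blocks are never longer than the shorter of the two inputs. A real pipeline would also need to handle gaps, strand, and alignment quality filtering, none of which this sketch attempts.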

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Journal ID (hwp): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): August 2013

Publication date (Electronic): 27 June 2013

Publication date PMC-release: 27 June 2013

Volume: 41

Issue: 15

Page: e151

Affiliations

¹Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA, ²Department of Computer Science, Stanford University, Stanford, CA 94305, USA and ³Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA

Author notes

*To whom correspondence should be addressed. Tel: +49 351 210 2781; Fax: +49 351 210 1209; Email: hiller@mpi-cbg.de

Correspondence may also be addressed to Gill Bejerano. Tel: +1 650 723 7666; Fax: +1 650 725 2923; Email: bejerano@stanford.edu

Present address: Michael Hiller, Computational Biology and Evolutionary Genomics, Max Planck Institute of Molecular Cell Biology and Genetics & Max Planck Institute for the Physics of Complex Systems, Dresden, Germany.

Article

Publisher ID: gkt557

DOI: 10.1093/nar/gkt557

PMC ID: 3753653

PubMed ID: 23814184

SO-VID: 47b9f74e-8b0b-4009-9c2b-770899e7bf38

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

History

Date received: 1 February 2013

Date revision received: 28 May 2013

Date accepted: 30 May 2013

Page count

Pages: 14
