High-resolution TADs reveal DNA sequences underlying genome organization in flies

Abstract

Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood. Here we use our software, HiCExplorer (hicexplorer.readthedocs.io), to annotate >2800 high-resolution (570 bp) TAD boundaries in Drosophila melanogaster. We identify eight DNA motifs enriched at boundaries, including a motif bound by the M1BP protein, and two new boundary motifs. In contrast to mammals, the CTCF motif is enriched at only a small fraction of boundaries flanking inactive chromatin, while most active boundaries contain the motifs bound by the M1BP or Beaf-32 proteins. We demonstrate that boundaries can be accurately predicted using only the motif sequences at open chromatin sites. We propose that DNA sequence guides the genome architecture by allocation of boundary proteins in the genome. Finally, we present an interactive online database to access and explore the spatial organization of fly, mouse and human genomes, available at http://chorogenome.ie-freiburg.mpg.de.

Abstract

Although topologically associating domains (TADs) have been extensively investigated, it is not clear to what extent DNA sequence contributes to their formation. Here the authors develop software to identify high-resolution TAD boundaries and reveal their relationship to underlying DNA motifs.

Author and article information

Contributors

Thomas Manke:

ORCID: http://orcid.org/0000-0003-3702-0868

manke@ie-freiburg.mpg.de

Journal

Journal ID (nlm-ta): Nat Commun

Journal ID (iso-abbrev): Nat Commun

Title: Nature Communications

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2041-1723

Publication date (Electronic): 15 January 2018

Publication date PMC-release: 15 January 2018

Publication date Collection: 2018

Volume: 9

Electronic Location Identifier: 189

Affiliations

[1] ISNI 0000 0004 0491 4256, GRID grid.429509.3, Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany

[2] GRID grid.5963.9, Faculty of Biology, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany

[3] GRID grid.5963.9, University of Freiburg, Department of Computer Science, Georges-Köhler-Allee 106, 79110 Freiburg, Germany

[4] Max Planck Institute of Biochemistry and Computational Biology, Am Klopferspitz 18, 82152 Martinsried, Germany

Author information

Fidel Ramírez http://orcid.org/0000-0002-9142-417X

Vivek Bhardwaj http://orcid.org/0000-0002-5570-9338

Laura Arrigoni http://orcid.org/0000-0002-3626-4468

Björn A. Grüning http://orcid.org/0000-0002-3079-6586

Thomas Manke http://orcid.org/0000-0003-3702-0868

Article

Publisher ID: 2525

DOI: 10.1038/s41467-017-02525-w

PMC ID: 5768762

PubMed ID: 29335486

SO-VID: bc5618ce-56ab-4c0a-8c21-8edc08228295

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 26 June 2017

Date accepted: 6 December 2017

Custom metadata

ScienceOpen disciplines: Uncategorized
