Integrating whole genome sequencing, methylation, gene expression, topological associated domain information in regulatory mutation prediction: A study of follicular lymphoma

Abstract

A major challenge in human genetics is the analysis of the interplay between genetic and epigenetic factors in a multifactorial disease such as cancer. Here, a novel methodology is proposed to investigate genome-wide regulatory mechanisms in cancer, using follicular lymphoma (FL) as an example. In the first phase, a new machine-learning method is designed to identify differentially methylated regions (DMRs) by computing six attributes. In the second phase, an integrative data analysis method is developed to study regulatory mutations in FL by considering differential methylation information together with DNA sequence variation, differential gene expression, the 3D organization of the genome (e.g., topologically associated domains), and enriched biological pathways. The resulting mutation block-gene pairs are then ranked to identify the significant ones. With this approach, BCL2 and BCL6 were identified as top-ranking FL-related genes, with several mutation blocks and DMRs acting on their regulatory regions. Two additional genes, CDCA4 and CTSO, also ranked among the top, with significant DNA sequence variation and differential methylation in neighboring areas, pointing towards their potential use as biomarkers for FL. This work combines genomic and epigenomic information to investigate genome-wide gene regulatory mechanisms in cancer and may contribute to devising novel treatment strategies.
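The second-phase idea of pairing mutation blocks with genes and ranking the pairs can be illustrated with a small toy example. This is a minimal sketch under stated assumptions, not the authors' method: the `Interval` class, the `rank_pairs` function, the additive score (mutation count plus number of overlapping DMRs plus one point for a differentially expressed gene), and all coordinates and names (`GENE_A`, `block1`, and so on) are hypothetical and chosen only to show how the data types named in the abstract might be combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open genomic interval [start, end) on one chromosome."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely inside `outer`."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def rank_pairs(blocks, genes, tads, dmrs, de_genes):
    """Pair each mutation block with every gene in the same TAD and rank.

    blocks:   dict of block name -> (Interval, mutation count)
    genes:    dict of gene name -> Interval
    tads:     list of Interval (topologically associated domains)
    dmrs:     list of Interval (differentially methylated regions)
    de_genes: set of differentially expressed gene names

    The score is an illustrative assumption, not a published formula.
    """
    scored = []
    for tad in tads:
        tad_blocks = [(n, iv, c) for n, (iv, c) in blocks.items()
                      if contains(tad, iv)]
        tad_genes = [g for g, iv in genes.items() if contains(tad, iv)]
        for block_name, block_iv, count in tad_blocks:
            n_dmr = sum(overlaps(block_iv, d) for d in dmrs)
            for gene in tad_genes:
                score = count + n_dmr + (1 if gene in de_genes else 0)
                scored.append((score, block_name, gene))
    # Highest score first; names break ties deterministically.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored


# Toy inputs with made-up coordinates.
tads = [Interval("chr1", 0, 1000), Interval("chr1", 1000, 2000)]
genes = {"GENE_A": Interval("chr1", 100, 300),
         "GENE_B": Interval("chr1", 1200, 1500)}
blocks = {"block1": (Interval("chr1", 350, 400), 5),
          "block2": (Interval("chr1", 1600, 1650), 2)}
dmrs = [Interval("chr1", 380, 420)]
de_genes = {"GENE_A"}

ranked = rank_pairs(blocks, genes, tads, dmrs, de_genes)
for score, block, gene in ranked:
    print(score, block, gene)
```

Restricting pairs to blocks and genes inside the same domain reflects the general use of topologically associated domains as a constraint on which regulatory regions can plausibly act on which genes. A real analysis would use a statistically grounded score rather than this simple sum.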

Author and article information

Contributors

Junbai Wang

Journal

Journal ID (nlm-ta): Comput Struct Biotechnol J

Journal ID (iso-abbrev): Comput Struct Biotechnol J

Title: Computational and Structural Biotechnology Journal

Publisher: Research Network of Computational and Structural Biotechnology

ISSN (Electronic): 2001-0370

Publication date PMC-release: 23 March 2022

Publication date Collection: 2022

Publication date (Electronic): 23 March 2022

Volume: 20

Pages: 1726-1742

Affiliations

[a] Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway

[b] Department of Clinical Molecular Biology, Institute of Clinical Medicine, University of Oslo, Norway

[c] Laboratory Medicine Program, University Health Network and University of Toronto, Toronto, Ontario, Canada

[d] Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital, Lørenskog, Norway

Author notes

[*] Corresponding author at: Institute of Clinical Medicine, University of Oslo, Norway. junbai.wang@medisin.uio.no

Article

Publisher Item ID: S2001-0370(22)00097-6

DOI: 10.1016/j.csbj.2022.03.023

PMC ID: 9024376

PubMed ID: 35495111

SO-VID: b5d1e5f4-3c31-4670-8f5f-ebbb506f12ed

License:

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

History

Date received: 7 January 2022

Date revision received: 22 March 2022

Date accepted: 22 March 2022
