Deciphering complex genome rearrangements in C. elegans using short-read whole genome sequencing

Abstract

Genomic rearrangements cause congenital disorders, cancer, and complex diseases in humans. Yet they remain understudied in rare diseases because their detection is challenging, despite the advent of whole genome sequencing (WGS) technologies. Short-read (srWGS) and long-read WGS approaches are regularly compared, and the latter is commonly recommended in studies focusing on genomic rearrangements. However, srWGS is currently the most economical, accurate, and widely supported technology. In Caenorhabditis elegans (C. elegans), such variants, induced by various mutagenesis processes, have been used for decades to balance large genomic regions by preventing chromosomal crossover events and allowing the maintenance of lethal mutations. Interestingly, those chromosomal rearrangements have rarely been characterized at the molecular level. To evaluate the ability of srWGS to detect various types of complex genomic rearrangements, we sequenced three balancer strains using short-read Illumina technology. Having experimentally validated the breakpoints uncovered by srWGS, we showed that, by combining several types of analyses, srWGS enables the detection of a reciprocal translocation (eT1), a free duplication (sDp3), a large deletion (sC4), and chromoanagenesis events. Thus, applying srWGS to decipher real complex genomic rearrangements in model organisms may help design efficient bioinformatics pipelines for the systematic detection of complex rearrangements in human genomes.

Author and article information

Contributors

Maja Tarailo-Graovac: maja.tarailograovac@ucalgary.ca

Journal

Journal ID (nlm-ta): Sci Rep

Journal ID (iso-abbrev): Sci Rep

Title: Scientific Reports

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2045-2322

Publication date (Electronic): 14 September 2021

Publication date PMC-release: 14 September 2021

Publication date Collection: 2021

Volume: 11

Electronic Location Identifier: 18258

Affiliations

[1] GRID grid.22072.35, ISNI 0000 0004 1936 7697, Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada

[2] GRID grid.22072.35, ISNI 0000 0004 1936 7697, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada

Article

Publisher ID: 97764

DOI: 10.1038/s41598-021-97764-9

PMC ID: 8440550

SO-VID: 9fd2e7cd-4ba8-4d1a-a9a7-633cc18294f3

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 6 July 2021

Date accepted: 30 August 2021

Funding

Funded by: Alberta Children’s Hospital Research Institute Foundation

Funded by: FundRef http://dx.doi.org/10.13039/100008762, Genome Canada

Award ID: 275SIL

Funded by: FundRef http://dx.doi.org/10.13039/501100000024, Canadian Institutes of Health Research

Award ID: GP1-155868

Award ID: PJT-156068

Funded by: Eyes High Postdoctoral Fellowship

Custom metadata

ScienceOpen disciplines: Uncategorized

Keywords: computational biology and bioinformatics, genetics
