Next-generation forward genetic screens: using simulated data to improve the design of mapping-by-sequencing experiments in Arabidopsis


Abstract

Forward genetic screens have successfully identified many genes and continue to be powerful tools for dissecting biological processes in Arabidopsis and other model species. Next-generation sequencing technologies have revolutionized the time-consuming process of identifying the mutations that cause a phenotype of interest. However, due to the cost of such mapping-by-sequencing experiments, special attention should be paid to experimental design and technical decisions so that the read data allow the desired mutation to be mapped. Here, we simulated different mapping-by-sequencing scenarios. We first evaluated which short-read technology was best suited for analyzing gene-rich genomic regions in Arabidopsis and determined the minimum sequencing depth required to confidently call single nucleotide variants. We also designed ways to discriminate mutagenesis-induced mutations from background single nucleotide polymorphisms in mutants isolated in Arabidopsis non-reference lines. In addition, we simulated bulked segregant mapping populations for identifying point mutations and monitored how the size of the mapping population and the sequencing depth affect mapping precision. Finally, we provide the computational basis of a protocol that we have already used to map T-DNA insertions with paired-end Illumina-like reads, using very low sequencing depths and pooling several mutants together; this approach can also be used with single-end reads and to map any other insertional mutagen. All these simulations proved useful for designing experiments that allowed us to map several mutations in Arabidopsis.
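To illustrate why sequencing depth matters for variant calling, the sketch below uses a textbook Poisson approximation of read coverage. It is not the simulation pipeline used in the article, which is not described in this abstract; the function name `prob_min_coverage` and the threshold of five supporting reads are assumptions chosen only for illustration.

```python
import math


def prob_min_coverage(mean_depth: float, min_reads: int) -> float:
    """Probability that a site is covered by at least `min_reads` reads.

    Assumes per-site coverage follows a Poisson distribution with mean
    `mean_depth`. This is a simplification: real coverage is usually more
    dispersed than Poisson.
    """
    # Sum the Poisson probabilities of observing 0 .. min_reads - 1 reads.
    below = sum(
        math.exp(-mean_depth) * mean_depth**i / math.factorial(i)
        for i in range(min_reads)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical threshold: a variant call needs at least 5 covering reads.
    for depth in (5, 10, 20, 40):
        p = prob_min_coverage(depth, min_reads=5)
        print(f"mean depth {depth:>2}x: P(coverage >= 5) = {p:.4f}")
```

Under this simplified model, the fraction of sites that can be called rises steeply with mean depth and then saturates, which is the kind of trade-off between cost and callable sites that a depth simulation is meant to expose.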


Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 02 December 2019

Publication date (Electronic): 23 September 2019

Publication date PMC-release: 23 September 2019

Volume: 47

Issue: 21

Page: e140

Affiliations

Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain

Author notes

To whom correspondence should be addressed. José Luis Micol. Tel: +34 96 665 85 04; Fax: +34 96 665 85 11; Email: jlmicol@umh.es

Author information

David Wilson-Sánchez http://orcid.org/0000-0002-8905-4026

Samuel Daniel Lup http://orcid.org/0000-0002-7647-1867

Raquel Sarmiento-Mañús http://orcid.org/0000-0001-6929-8034

María Rosa Ponce http://orcid.org/0000-0003-0770-4230

José Luis Micol http://orcid.org/0000-0002-0396-1750

Article

Publisher ID: gkz806

DOI: 10.1093/nar/gkz806

PMC ID: 6868388

PubMed ID: 31544937

SO-VID: f6658c2b-18ba-453a-bcce-ead0031f3798

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

History

Date accepted: 10 September 2019

Date revision received: 07 September 2019

Date received: 30 March 2019

Page count

Pages: 14

Funding

Funded by: Ministerio de Ciencia, Investigación y Universidades of Spain

Award ID: BIO2014-53063-P

Award ID: PGC2018-093445-B-I00

Funded by: Generalitat Valenciana 10.13039/501100003359

Award ID: Prometeo/2019/117
