VCPA: genomic variant calling pipeline and data management tool for Alzheimer’s Disease Sequencing Project

Abstract

Summary

We report VCPA, our SNP/Indel Variant Calling Pipeline and data management tool used for the analysis of whole genome and exome sequencing (WGS/WES) for the Alzheimer’s Disease Sequencing Project. VCPA consists of two independent but linkable components: pipeline and tracking database. The pipeline, implemented using the Workflow Description Language and fully optimized for the Amazon elastic compute cloud environment, includes steps from aligning raw sequence reads to variant calling using GATK. The tracking database allows users to view job running status in real time and visualize >100 quality metrics per genome. VCPA is functionally equivalent to the CCDG/TOPMed pipeline. Users can use the pipeline and the dockerized database to process large WGS/WES datasets on Amazon cloud with minimal configuration.
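As a rough illustration of what a tracking database of this kind might record, the following minimal sketch stores per-sample job status and quality metrics in SQLite. It is not VCPA's schema or code: the table names, column names, step names, metric name and helper functions are all hypothetical, chosen only to mirror the two roles the summary describes (job status and per-genome quality metrics).

```python
import sqlite3

# Hypothetical step names; the summary only states that the pipeline runs
# from read alignment through variant calling with GATK.
STEPS = ("alignment", "variant_calling")


def open_tracking_db(path=":memory:"):
    """Create the (hypothetical) tracking tables and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_status ("
        "sample_id TEXT, step TEXT, status TEXT, "
        "PRIMARY KEY (sample_id, step))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quality_metric ("
        "sample_id TEXT, name TEXT, value REAL, "
        "PRIMARY KEY (sample_id, name))"
    )
    return conn


def record_status(conn, sample_id, step, status):
    """Insert or overwrite the status of one pipeline step for one sample."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    conn.execute(
        "INSERT OR REPLACE INTO job_status VALUES (?, ?, ?)",
        (sample_id, step, status),
    )
    conn.commit()


def record_metric(conn, sample_id, name, value):
    """Insert or overwrite one quality metric for one sample."""
    conn.execute(
        "INSERT OR REPLACE INTO quality_metric VALUES (?, ?, ?)",
        (sample_id, name, float(value)),
    )
    conn.commit()


def sample_report(conn, sample_id):
    """Return the latest step statuses and quality metrics for one sample."""
    statuses = dict(
        conn.execute(
            "SELECT step, status FROM job_status WHERE sample_id = ?",
            (sample_id,),
        )
    )
    metrics = dict(
        conn.execute(
            "SELECT name, value FROM quality_metric WHERE sample_id = ?",
            (sample_id,),
        )
    )
    return {"status": statuses, "metrics": metrics}
```

Keying each table on the sample plus the step or metric name means a status update overwrites the previous row, so a report always reflects the latest state of each step. A production system would presumably also keep timestamps and history.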

Availability and implementation

VCPA is released under the MIT license and is available for academic and nonprofit use for free. The pipeline source code and step-by-step instructions are available from the National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (http://www.niagads.org/VCPA).

Supplementary information

Supplementary data are available at Bioinformatics online.

Author and article information

Contributors

Prabhakaran Gangadharan: On behalf of: Alzheimer’s Disease Sequencing Project (ADSP)

John Hancock: Role: Associate Editor

Journal

Journal ID (nlm-ta): Bioinformatics

Journal ID (iso-abbrev): Bioinformatics

Journal ID (publisher-id): bioinformatics

Title: Bioinformatics

Publisher: Oxford University Press

ISSN (Print): 1367-4803

ISSN (Electronic): 1367-4811

Publication date (Print): 15 May 2019

Publication date (Electronic): 23 October 2018

Publication date PMC-release: 23 October 2018

Volume: 35

Issue: 10

Pages: 1768-1770

Affiliations

[1 ]Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Penn Neurodegeneration Genomics Center, Philadelphia, PA, USA

[2 ]Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA

Author notes

To whom correspondence should be addressed. Email: yyee@pennmedicine.upenn.edu or lswang@pennmedicine.upenn.edu

Article

Publisher ID: bty894

DOI: 10.1093/bioinformatics/bty894

PMC ID: 6513159

PubMed ID: 30351394

SO-VID: 1076fed5-8a40-403a-b5e5-14eb244293b0

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

History

Date received: 10 June 2018

Date revision received: 27 September 2018

Date accepted: 22 October 2018

Page count

Pages: 3

Funding

Funded by: National Institute on Aging 10.13039/100000049

Award ID: U54-AG052427

Award ID: U01-AG032984

Award ID: U24-AG041689
