The Need for a Human Pangenome Reference Sequence

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

The reference human genome sequence is inarguably the most important and widely used resource in the fields of human genetics and genomics. It has transformed the conduct of biomedical sciences and brought invaluable benefits to the understanding and improvement of human health. However, the commonly used reference sequence has profound limitations, because across much of its span, it represents the sequence of just one human haplotype. This single, monoploid reference structure presents a critical barrier to representing the broad genomic diversity in the human population. In this review, we discuss the modernization of the reference human genome sequence to a more complete reference of human genomic diversity, known as a human pangenome.

Related collections

Most cited references 144

Record: found
Abstract: found
Article: found

Is Open Access

A pneumonia outbreak associated with a new coronavirus of probable bat origin

Peng Zhou, Xing-Lou Yang, Xian-Guang Wang … (2020)

Since the outbreak of severe acute respiratory syndrome (SARS) 18 years ago, a large number of SARS-related coronaviruses (SARSr-CoVs) have been discovered in their natural reservoir host, bats 1–4 . Previous studies have shown that some bat SARSr-CoVs have the potential to infect humans 5–7 . Here we report the identification and characterization of a new coronavirus (2019-nCoV), which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started on 12 December 2019, had caused 2,794 laboratory-confirmed infections including 80 deaths by 26 January 2020. Full-length genome sequences were obtained from five patients at an early stage of the outbreak. The sequences are almost identical and share 79.6% sequence identity to SARS-CoV. Furthermore, we show that 2019-nCoV is 96% identical at the whole-genome level to a bat coronavirus. Pairwise protein sequence analysis of seven conserved non-structural proteins domains show that this virus belongs to the species of SARSr-CoV. In addition, 2019-nCoV virus isolated from the bronchoalveolar lavage fluid of a critically ill patient could be neutralized by sera from several patients. Notably, we confirmed that 2019-nCoV uses the same cell entry receptor—angiotensin converting enzyme II (ACE2)—as SARS-CoV.

0 comments Cited 10295 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

A global reference for human genetic variation

Lachlan Coin, Robert Garry, Oleksyk Taras (2018)

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

0 comments Cited 4417 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype

Daehwan Kim, Joseph M Paggi, Chanhee Park … (2021)

Rapid advances in next-generation sequencing technologies have dramatically changed our ability to perform genome-scale analyses. The human reference genome used for most genomic analyses represents only a small number of individuals, limiting its usefulness for genotyping. We designed a novel method, HISAT2, for representing and searching an expanded model of the human reference genome, in which a large catalogue of known genomic variants and haplotypes is incorporated into the data structure used for searching and alignment. This strategy for representing a population of genomes, along with a fast and memory-efficient search algorithm, enables more detailed and accurate variant analyses than previous methods. We demonstrate two initial applications of HISAT2: HLA typing, a critical need in human organ transplantation, and DNA fingerprinting, widely used in forensics. These applications are part of HISAT-genotype, with performance not only surpassing earlier computational methods, but matching or exceeding the accuracy of laboratory-based assays.

0 comments Cited 3890 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Journal

Title: Annual Review of Genomics and Human Genetics

Abbreviated Title: Annu. Rev. Genom. Hum. Genet.

Publisher: Annual Reviews

ISSN (Print): 1527-8204

ISSN (Electronic): 1545-293X

Publication date Created: August 31 2021

Publication date (Print): August 31 2021

Volume: 22

Issue: 1

Pages: 81-102

Affiliations

[1 ]UC Santa Cruz Genomics Institute and Department of Biomedical Engineering, University of California, Santa Cruz, California 95064, USA;

[2 ]Department of Genetics, Edison Family Center for Genome Sciences and Systems Biology, and McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63110, USA;

Article

DOI: 10.1146/annurev-genom-120120-081921

PubMed ID: 33929893

SO-VID: 3c750d69-a82d-4f52-8bcb-cdcf3a707089

License:

http://creativecommons.org/licenses/by/4.0/

History

Data availability:

Comments

Comment on this article

scite_

Cited by 44

See all cited by

The Need for a Human Pangenome Reference Sequence

Read this article at

Abstract

Related collections

Radiology and Natural Language Processing

Most cited references 144

A pneumonia outbreak associated with a new coronavirus of probable bat origin

A global reference for human genetic variation

Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype

Author and article information

Journal

Affiliations

Article

History

Comments

Comment on this article

Similar content 538

Cited by 44