Whole-genome sequencing expands diagnostic utility and improves clinical management in paediatric medicine

Abstract

The standard of care for first-tier clinical investigation of the aetiology of congenital malformations and neurodevelopmental disorders is chromosome microarray analysis (CMA) for copy-number variations (CNVs), often followed by gene(s)-specific sequencing searching for smaller insertion–deletions (indels) and single-nucleotide variant (SNV) mutations. Whole-genome sequencing (WGS) has the potential to capture all classes of genetic variation in one experiment; however, the diagnostic yield for mutation detection of WGS compared to CMA, and other tests, needs to be established. In a prospective study we utilised WGS and comprehensive medical annotation to assess 100 patients referred to a paediatric genetics service and compared the diagnostic yield with that of standard genetic testing. WGS identified genetic variants meeting clinical diagnostic criteria in 34% of cases, representing a fourfold increase in diagnostic rate over CMA alone (8%; P value=1.42E−05) and a more than twofold increase over CMA plus targeted gene sequencing (13%; P value=0.0009). WGS identified all rare clinically significant CNVs that were detected by CMA. In 26 patients, WGS revealed indel and missense mutations presenting in a dominant (63%) or a recessive (37%) manner. We found four subjects with mutations in at least two genes associated with distinct genetic disorders, including two cases harbouring a pathogenic CNV and SNV. When considering medically actionable secondary findings in addition to primary WGS findings, 38% of patients would benefit from genetic counselling. Clinical implementation of WGS as a primary test will provide a higher diagnostic yield than conventional genetic testing and potentially reduce the time required to reach a genetic diagnosis.
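
The abstract does not say which statistical test produced its P values. As a rough check, and under an assumption that is not taken from the article, a chi-square test with Yates continuity correction on 2×2 tables of diagnosed versus undiagnosed cases (34 vs 8, and 34 vs 13, out of 100 each) gives values close to those reported. The function name `yates_chi2_p` and the choice of test are illustrative only; because the same 100 patients received every test, a paired test such as McNemar's could also be argued for.

```python
import math


def yates_chi2_p(a, b, c, d):
    """Chi-square statistic and two-sided P value for the 2x2 table
    [[a, b], [c, d]], using Yates continuity correction (1 degree of freedom).

    Assumption: the two rows are treated as independent groups.
    """
    n = a + b + c + d
    # Yates correction subtracts n/2 from |ad - bc| (never going below zero).
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Counts out of 100 patients, taken from the percentages in the abstract.
N_PATIENTS = 100
WGS_DIAGNOSED = 34
CMA_DIAGNOSED = 8
CMA_PLUS_SEQ_DIAGNOSED = 13

chi2_cma, p_cma = yates_chi2_p(
    WGS_DIAGNOSED, N_PATIENTS - WGS_DIAGNOSED,
    CMA_DIAGNOSED, N_PATIENTS - CMA_DIAGNOSED,
)
chi2_seq, p_seq = yates_chi2_p(
    WGS_DIAGNOSED, N_PATIENTS - WGS_DIAGNOSED,
    CMA_PLUS_SEQ_DIAGNOSED, N_PATIENTS - CMA_PLUS_SEQ_DIAGNOSED,
)

print(f"WGS vs CMA alone:           chi2={chi2_cma:.2f}, P={p_cma:.2e}")
print(f"WGS vs CMA plus sequencing: chi2={chi2_seq:.2f}, P={p_seq:.2e}")
```

Under these assumptions the two P values come out at roughly 1.4E−05 and 0.0009, in line with the figures quoted in the abstract, which suggests (but does not prove) that a test of this kind was used.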

Most cited references 14

Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants.

Dalila Pinto, Katayoon Darvishi, Xinghua Shi … (2011)

We have systematically compared copy number variant (CNV) detection on eleven microarrays to evaluate data quality and CNV calling, reproducibility, concordance across array platforms and laboratory sites, breakpoint accuracy and analysis tool variability. Different analytic tools applied to the same raw data typically yield CNV calls with <50% concordance. Moreover, reproducibility in replicate experiments is <70% for most platforms. Nevertheless, these findings should not preclude detection of large CNVs for clinical diagnostic purposes because large CNVs with poor reproducibility are found primarily in complex genomic regions and would typically be removed by standard clinical data curation. The striking differences between CNV calls from different platforms and analytic tools highlight the importance of careful assessment of experimental design in discovery and association studies and of strict data curation and filtering in diagnostics. The CNV resource presented here allows independent data evaluation and provides a means to benchmark new algorithms.

Cited 169 times

The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information

Cynthia L Smith, Carroll-Ann Goldsmith, Janan Eppig (2004)

Background

Mammalian phenotypes are complex and the term itself is imprecise. Generally, we use the word phenotype in referring to the appearance or manifestation of a set of traits in an individual that result from the combined action and interaction of genotype and environment. Because mouse is the premier model organism for the study of human biology and disease, the goal of comparative phenotyping and building new animal models through genetic engineering holds great promise. The mouse has distinct advantages for studies that translate to humans. It is a small, short-lived mammal with a fully sequenced genome in which all life stages can be accessed, and for which myriad tools are available for precisely experimentally manipulating its genome. Further, the large collection of inbred strains of mice and the controlled environment in which the animals live provides the ability to confirm phenotype observations and to systematically perturb environmental factors and genetic input to measure effects under defined conditions. Current international efforts to 'make a mutation' for every gene through mutagenesis [1] and genetic engineering [2,3] make it imperative for phenotype data to be represented in standard descriptive formats to enable computational analysis and comparison. Mammalian phenotypes are frequently genetically complex. Mutation of even a single gene almost always produces pleiotropic effects. Conversely, non-allelic mutations can produce indistinguishable phenotypes. Modifier genes and epistatic interactions can markedly alter the phenotype. Combining different allelic combinations of different genes can produce unique phenotypes not found in the single-gene mutation genotype. Imprinting of genes can dramatically affect phenotype. Mutations expressed in different inbred strains of mice can manifest as an increase or decrease of severity or penetrance of the corresponding phenotype.
Quantitative trait loci (QTL) can contribute in complex nonlinear ways to the phenotype. In addition, mutations that are 'genomic' in nature, either disrupting or deleting multiple genes or occurring in intergenic regions, can produce distinct phenotypes and challenge us to think beyond gene effects to genomic effects. The outcome of these complex interactions can be dissected and reproducibly examined by characterizing inbred strains that represent the combined phenotype of the 'whole-genome' genotype in its environmental context. The Mouse Genome Database (MGD) at the Mouse Genome Informatics website [4,5] serves as the model organism database for mouse, representing the genetics, genomics and biology of the mouse and as a community resource for mammalian studies. Significant reorganization and modeling of phenotypes is now underway to support these data robustly, to represent phenotypes in ways that are computationally accessible, and to provide human interfaces to these data that will enable knowledge building and hypothesis generation. One component of this work is the development of the Mammalian Phenotype (MP) Ontology, a structured vocabulary that will aid in standardizing annotations and, with its concept definitions, unambiguously describe phenotypic observations.

Results and discussion

The problems of text

Written descriptions of phenotypes in higher organisms reflect the complexity of the subject, the richness of language, and the phenomenal diversity that these data represent. While text descriptions are commonly used in publications describing phenotype, and have been the basis of electronically accessible phenotypic descriptions (for example, Online Mendelian Inheritance in Man (OMIM) [6] and the Mouse Locus Catalog (MLC) [7]), text is unreliable for searching, either manually or computationally.
From the user's perspective, even the best full-text search including Boolean operators will miss appropriate records (false negatives) and return unwanted records (false positives). Consider the example in Table 1 where searches were done to find spontaneous mutations in which mice were entirely or partially devoid of hair/fur. To obtain a complete result, the user would need to use a number of search terms and synonyms. The wording within the text depends upon the author of the record and his/her particular word usage and editorial style. A minimum of four search terms is needed to recover the 27 relevant mutations displayed in this table and it cannot be ascertained if this is a complete set of mutations for this phenotype. Conversely, the user is returned with 23 irrelevant results. Irrelevant results can be returned for many reasons including, but not limited to, the following: the author of the record is contrasting the phenotype of a mutation in one gene with a mutation in another gene; the author is making a statement that includes the negation of the trait; the match is based on gene name rather than phenotype; the mutant was used as a linkage marker to genetically map another gene. A further detriment to database text records is their difficulty to update and maintain. As new information is learned about a phenotypic mutant, the record must be continually rewritten. Although this practice might be sustained for a small number of records, it does not scale when thousands of mutant records are considered. The alternative of simply adding on another paragraph to existing text records becomes confusing, with potentially conflicting information and different writing styles appearing in one textual description, and unwieldy, with more and more text that may no longer represent a logical synthesis.

Nomenclatures, vocabularies and ontologies

Formal nomenclatures for genes, mutant alleles and inbred strains of mice have existed since the 1940s [8,9].
The MGD [4] serves as the authoritative source for the names and symbols associated with mouse genes, alleles and strains. The advantage of applying such nomenclatures has been increasingly recognized as genomes become better defined and the realized power of comparative genomics allows homologous and orthologous gene relationships to be explicitly defined. At present, human, mouse and rat gene nomenclatures operate in parallel, using coordinated symbols for all three species' genes. In addition, mouse and rat strain nomenclatures were merged to one standard strain nomenclature recently, making strain identity and nomenclature conventions consistent. Nomenclature guides for mouse and rat genes, mutant alleles, and strains are available online and regularly revised based on international nomenclature committees' reviews [10]. Beyond nomenclatures, which are key to object identities and relationships, are vocabularies that can be used to describe broader concepts and categorizations. Vocabularies can take many forms, including simple lists of controlled terms, such as the cytogenetic band designations used to name the bands defined by chromosome staining or the classes of genetic markers, such as gene, pseudogene, expressed sequence tag (EST), and so forth. The annotation of complex biological data and concepts requires more than lists and simple vocabularies. Ontologies, or 'descriptions of what there is', contain both concepts, with precise meanings, and relationships among those concepts. As such ontologies are able to support descriptions of complex biology and are useful in making these data more amenable to computational analyses. The first widely used ontology developed and adopted in the biological domain is the Gene Ontology (GO) [11-13] which contains concepts of molecular function, cellular localization and biological process for annotating the functional aspects of genes. 
The GO is structured as a directed acyclic graph (DAG), where each vocabulary term (node) may have both multiple parent term and multiple child term relationships. MGD uses GO extensively for gene annotation [14]. In addition, MGD has adopted the Mouse Embryo Anatomy Nomenclature Database [15] and the Anatomical Dictionary for the Adult Mouse [16] for annotating data that include anatomical attributes, such as tissue sources for clones and phenotypes. The Gene Expression Database (GXD) [17], integrated with MGD through the Mouse Genome Informatics (MGI) system [4], applies these anatomical ontologies as a central concept in the description of expression data.

Mammalian Phenotype Ontology

Although the need for vocabularies as key components to consistent phenotype annotations for mammals has been recognized for some time [18], and many smaller controlled vocabularies have been implemented to describe various aspects of phenotype in MGD (for example, class of mutation, embryonic stem (ES) cell lines used for generating targeted mutations, type of inheritance), much of the data has remained in text form. Over the past two years, the Mammalian Phenotype (MP) Ontology has emerged to more precisely describe phenotypes, and to allow easier access to phenotype-sequence interactions. Our goal is to describe the richness of phenotypes as precisely as they are known, recognizing that phenotype data are by nature complex and usually incomplete. Taking advantage of structural properties of a DAG, we have the ability to annotate phenotypes to the level of data resolution available, whether general or very specific and the ability to query with a high-level term, returning all phenotypes containing annotations to that term or to terms more specific than the query term.
Thus, one can query for 'respiratory signs/symptoms' and retrieve all phenotypes annotated to this term and its hierarchical 'children' (abnormal breathing, abnormal respiratory sounds, anoxia, apnea, dyspnea, hypercapnia, and so on), or specifically request annotations to any of these sub-terms. The top level terms of the MP Ontology include physiological systems, behavior, developmental phenotypes and survival/aging. Physiological systems branch into morphological and physiological phenotypes at the level immediately below. A browser to view the ontology is available at [19] (Figure 1). In this browser the DAG structure is flattened into a hierarchy, with multiple hierarchies representing unique paths to a term displayed sequentially. MP terms and synonyms can be searched or users can browse the ontology starting from the high-level terms and open levels continuously to increasingly granular terms. Each MP ontology term has a unique identifier, a definition and synonyms. In the term detail pages, these data and the number of hierarchical paths of the vocabulary where the terms appear are displayed. A plus sign following the term indicates that children of this term exist. In this figure, displayed next to the term, is a link indicating the number of annotation instances in MGD using this term or children of this term. This feature, due to be publicly available in early 2005, will greatly improve phenotype-centric searching in MGD.

Developing the MP vocabulary

To initiate the vocabulary, we first developed a high-level categorization of phenotypes consisting of approximately 100 terms, such as heart/cardiovascular dysmorphology and skeletal axial defects. As we used this list for annotations, terms were refined and general organizing principles for the MP vocabulary were developed. An important component of our approach has been to address two practical implementation questions.
From the biologist's perspective, the question is what term would be used to describe a specific phenotypic trait. From the curation perspective, we ask what terms reflect biological reality and maximize curator productivity. From a purely ontological perspective, every trait could be broken down into a core object, such as 'cornea' or 'gastrulation', defined by anatomical, behavioral or physiological terms, and a series of attribute vocabularies that describe the quality, quantity and character of a trait. For the practical reason of needing robust terms to describe phenotypes up-front to speed curation and the problem of losing biological meaning, particularly for clinical or dysmorphology terms, when terms are completely deconstructed (that is, the sum of the parts is less than the term itself), we have chosen to use compound terms in the MP Ontology. A few examples of terms where it is difficult to preserve the full biological meaning once they are deconstructed are shown in Table 2. In addition, it should be noted that each of these terms requires multiple annotations to recover all aspects that the single term provides. Use of complex terms in the MP Ontology, however, does not preclude also storing the decomposed version should this later prove desirable (see PATO model discussed in [20]). More important, the MP Ontology can currently hold, for each term, database cross-references to other ontologies. This is a common practice in GO when compound terms are developed. For the MP Ontology, these cross-references include anatomical terms from the Mouse Anatomy ontologies [15,16] and the GO process terms [21]. Three major strategies are being pursued to further develop the vocabulary itself. First and most important is through the ongoing process of curating phenotype data. As new phenotypic traits are described and published, the need for new terms is recognized. 
New terms added in this way may be a simple addition to an existing hierarchical path or may result in the addition of entire new branches in the hierarchy. Second, collaborative efforts between the MGD phenotype curators, the mouse mutagenesis centers and the rat genetics community identify new specific terms and suggest improved organization of terms within particular hierarchical branches. Third, we are recruiting individuals with expertise in specific biological domains to review and evaluate sections of the vocabulary for accuracy, completeness and systematic arrangement. The MP Ontology is a work in progress and remains incomplete in some areas. We welcome the participation of the mammalian research community so that the most useful, definitive and universally applicable terms will be included. Information can be obtained by sending e-mail to pheno@informatics.jax.org. While common pathological and clinical terms are used in the MP Ontology, considerations for term placement within the structure and for precise terminology are often derived from comparison with other open biological ontologies (OBO) [22]. Recently, a cell-type ontology has become available [23] and a comparison of terminology to this ontology has not yet been completed. We are working with the mutant mouse pathology database Pathbase [24,25] to map and cross-reference terms from their Pathology Ontology.

Vocabulary tools

The MP Ontology was built as a DAG using the DAG-Edit software written by John Richter and Suzanna Lewis [26]. The MP Ontology is updated daily and can be browsed or searched online at [19]. MP files also are available in flat file format and OBO format from our ftp site [27] and are posted at the OBO site [28].

Phenotype data annotation

Phenotypes are described in the MGD relative to the genotype of the individual.
Genotype objects specifically consist of one or more allele pairs describing mutations or QTLs and the genetic background strain(s) where the phenotype was observed. Each phenotype annotation associates an MP Ontology term with a genotype/strain and the reference or data source supporting this assertion. Additional modifying text may be annotated to describe detail that is not easily standardized. Examples include experimental conditions, age of onset and incidence, and trait penetrance, among others. The annotation note may also include specifics of the phenotype where such details are deemed to be too case specific to be an MP term. In addition, genotypes are associated with OMIM where a particular mouse genotype is a model for human diseases and syndromes. Figure 2 shows the portion of one phenotype record that uses the MP Ontology.

Conclusions

The MP Ontology and annotation schema was designed to minimize curatorial time, yet remain precise enough to describe phenotypic data. It supports robust phenotypic annotations and querying capabilities for mouse phenotype data. While this vocabulary is far from complete, we have designed strategies for its continued development as a collaborative effort for supporting the representation of existing mutations and those that continue to be created. As of 1 November 2004, over 11,150 phenotypic alleles representing mutations in 5,214 unique genes had been catalogued in MGD. For these alleles, 9,696 genotype records exist, with 21,556 phenotypic annotation instances. The MP Ontology is also used in phenotypic data annotations at the RGD [29]. As our database groups continue to accumulate annotations, it will be possible to mine these data to ask interesting questions about similarities and differences in comparable allele effects between the species, as well as within species.
Comparative phenotype data will potentially uncover new modifier effects and point to new pathway relationships and genetic networks tied to disease processes. The MP Ontology will be critical for enabling computational analyses and providing a framework for improved web views and other human-comprehensible displays for the research community.
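
The querying behaviour described in this reference (annotate at any level of a DAG, then query a high-level term and retrieve annotations to it and to all more specific terms) can be sketched as follows. This is a minimal illustration, not the MGD implementation: the class `PhenotypeDag`, its methods, the genotype labels and the exact parent/child links (including the second parent given to 'apnea') are assumptions made for the example; only the term names are taken from the text.

```python
from collections import defaultdict


class PhenotypeDag:
    """Toy ontology stored as a directed acyclic graph (hypothetical helper)."""

    def __init__(self):
        self.children = defaultdict(set)      # term -> direct child terms
        self.annotations = defaultdict(set)   # term -> genotypes annotated to it

    def add_term(self, term, parents=()):
        # A term may have several parents, which is what makes this a DAG
        # rather than a simple tree.
        self.children.setdefault(term, set())
        for parent in parents:
            self.children[parent].add(term)

    def annotate(self, genotype, term):
        self.annotations[term].add(genotype)

    def descendants(self, term):
        """Return the term itself plus every more specific term below it."""
        seen = {term}
        stack = [term]
        while stack:
            current = stack.pop()
            for child in self.children.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def query(self, term):
        """Genotypes annotated to the term or to any of its descendants."""
        found = set()
        for t in self.descendants(term):
            found |= self.annotations.get(t, set())
        return found


dag = PhenotypeDag()
dag.add_term("respiratory signs/symptoms")
dag.add_term("abnormal breathing", parents=["respiratory signs/symptoms"])
dag.add_term("dyspnea", parents=["respiratory signs/symptoms"])
# Illustrative only: 'apnea' is given two parents to show multiple inheritance.
dag.add_term("apnea", parents=["respiratory signs/symptoms", "abnormal breathing"])

# Hypothetical genotype labels, annotated at different levels of detail.
dag.annotate("genotype-A", "respiratory signs/symptoms")
dag.annotate("genotype-B", "apnea")
dag.annotate("genotype-C", "dyspnea")

high_level = dag.query("respiratory signs/symptoms")
specific = dag.query("apnea")
print(sorted(high_level))
print(sorted(specific))
```

Querying the high-level term returns all three genotypes, while querying 'apnea' returns only the genotype annotated to that specific term. Tracking visited terms in `descendants` keeps a term with several parents from being counted more than once.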

Cited 167 times

Recessive mutations in a distal PTF1A enhancer cause isolated pancreatic agenesis

Michael N. Weedon, Inês Cebola, Ann-Marie Patch … (2013)

The contribution of cis-regulatory mutations to human disease remains poorly understood. Whole genome sequencing can identify all non-coding variants, yet discrimination of causal regulatory mutations represents a formidable challenge. We used epigenomic annotation in hESC-derived embryonic pancreatic progenitor cells to guide the interpretation of whole genome sequences from patients with isolated pancreatic agenesis. This uncovered six different recessive mutations in a previously uncharacterized ~400bp sequence located 25kb downstream of PTF1A (pancreas-specific transcription factor 1a) in ten families with pancreatic agenesis. We show that this region acts as a developmental enhancer of PTF1A and that the mutations abolish enhancer activity. These mutations are the most common cause of isolated pancreatic agenesis. Integrating genome sequencing and epigenomic annotation in a disease-relevant cell type can uncover novel non-coding elements underlying human development and disease.

Cited 114 times

Author and article information

Journal

Journal ID (nlm-ta): NPJ Genom Med

Journal ID (iso-abbrev): NPJ Genom Med

Title: NPJ Genomic Medicine

Publisher: Nature Publishing Group

ISSN (Electronic): 2056-7944

Publication date (Electronic): 13 January 2016

Publication date Collection: 2016

Volume: 1

Page: 15012

Affiliations

[1] Genome Diagnostics, Department of Paediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, ON, Canada

[2] Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada

[3] The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON, Canada

[4] Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada

[5] Division of Clinical and Metabolic Genetics, The Hospital for Sick Children, Toronto, ON, Canada

[6] Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, ON, Canada

[7] Department of Paediatrics, University of Toronto, Toronto, ON, Canada

[8] Joint Centre for Bioethics, University of Toronto, Toronto, ON, Canada

[9] Department of Family and Community Medicine and Clinical Public Health Division, University of Toronto, Toronto, ON, Canada

[10] Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Ontario, Canada

[11] Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada

[12] Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada

[13] Department of Computer Science, University of Toronto, Toronto, ON, Canada

[14] Department of Electrical and Computer Engineering and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada

[15] Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada

[16] Prenatal Diagnosis and Medical Genetics Program, Mount Sinai Hospital, Toronto, ON, Canada

[17] Department of Paediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, ON, Canada

[18] Department of Ophthalmology and Vision Sciences, The Hospital for Sick Children, Toronto, ON, Canada

[19] Complete Genomics Inc, Mountain View, CA, USA

Author notes

[*] (crm@sickkids.ca)

[20] These authors contributed equally to this work.

CRM, DJS, MSM, PNR designed the study, with contributions from SB, RDC and SWS. CRM and DJS drafted the manuscript. MSM, RJ, RDC, PNR, SWS contributed to preparation of the manuscript. CRM, DJS, RJ, MSM, SB, NM were responsible for variant classification and interpretation. DM, CRM, DJS, RKCY created and developed the variant analysis pipeline. NM and CC were responsible for study coordination. BT, TN, GP, BF, BA contributed to sequence data analysis. MS, RZS, MSM were responsible for consent design, with contributions from SB, RH and CS. RH contributed to test costing and data collection. MB and MG contributed to phenotype data collection and analysis. RE, RAL, RK provided whole-genome sequencing and analysis. MSM, SB, RJ, NM, RDC, SA, RBH, RBP, MTC, LC, AC, DC, SJD, CC, LD, RE, LF, AG, BH, MH, SH, MIF, PK, NK, RK, JK, EL, HM, SMM, RML, EN, GN, NP, NQ, JR, MR, AS, AS, CS, PS, KS, RW, GY were responsible for patient recruitment, clinical phenotyping and participation in clinical review meetings. CRM and DJS had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Article

Publisher Item ID: npjgenmed201512

DOI: 10.1038/npjgenmed.2015.12

PMC ID: 5447450

PubMed ID: 28567303

SO-VID: efbff9b3-d808-4dd1-a1bb-626fe7a0ab15

License:

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

History

Date received: 23 November 2015

Date revision received: 26 November 2015

Date accepted: 26 November 2015
