
      Sequential targeted exome sequencing of 1001 patients affected by unexplained limb-girdle weakness

      The MYO-SEQ consortium
      Genetics in Medicine
      Nature Publishing Group US
      next-generation sequencing, targeted exome analysis, neuromuscular disease, limb-girdle weakness, genetic diagnosis


          Abstract

          Purpose

          Several hundred genetic muscle diseases have been described, all of which are rare. Their clinical and genetic heterogeneity means that a genetic diagnosis is challenging. We established an international consortium, MYO-SEQ, to aid the work-ups of muscle disease patients and to better understand disease etiology.

          Methods

          Exome sequencing was applied to 1001 undiagnosed patients recruited from more than 40 neuromuscular disease referral centers; standardized phenotypic information was collected for each patient. Exomes were examined for variants in 429 genes associated with muscle conditions.

          Results

          We identified suspected pathogenic variants in 52% of patients across 87 genes. We detected 401 novel variants, 116 of which were recurrent. Variants in CAPN3, DYSF, ANO5, DMD, RYR1, TTN, COL6A2, and SGCA collectively accounted for over half of the solved cases; while variants in newer disease genes, such as BVES and POGLUT1, were also found. The remaining well-characterized unsolved patients (48%) need further investigation.

          Conclusion

          Using our unique infrastructure, we developed a pathway to expedite muscle disease diagnoses. Our data suggest that exome sequencing should be used for pathogenic variant detection in patients with suspected genetic muscle diseases, focusing first on the most common disease genes described here, and subsequently in rarer and newly characterized disease genes.

          Most cited references (32)

          Improving genetic diagnosis in Mendelian disease with transcriptome sequencing

          Exome and whole-genome sequencing are becoming increasingly routine approaches in Mendelian disease diagnosis. Despite their success, the current diagnostic rate for genomic analyses across a variety of rare diseases is approximately 25 to 50%. We explore the utility of transcriptome sequencing [RNA sequencing (RNA-seq)] as a complementary diagnostic tool in a cohort of 50 patients with genetically undiagnosed rare muscle disorders. We describe an integrated approach to analyze patient muscle RNA-seq, leveraging an analysis framework focused on the detection of transcript-level changes that are unique to the patient compared to more than 180 control skeletal muscle samples. We demonstrate the power of RNA-seq to validate candidate splice-disrupting mutations and to identify splice-altering variants in both exonic and deep intronic regions, yielding an overall diagnosis rate of 35%. We also report the discovery of a highly recurrent de novo intronic mutation in COL6A1 that results in a dominantly acting splice-gain event, disrupting the critical glycine repeat motif of the triple helical domain. We identify this pathogenic variant in a total of 27 genetically unsolved patients in an external collagen VI–like dystrophy cohort, thus explaining approximately 25% of patients clinically suggestive of having collagen VI dystrophy in whom prior genetic analysis is negative. Overall, this study represents a large systematic application of transcriptome sequencing to rare disease diagnosis and highlights its utility for the detection and interpretation of variants missed by current standard diagnostic approaches.

            Diagnosis and management of Duchenne muscular dystrophy, part 1: diagnosis, and pharmacological and psychosocial management.

            Duchenne muscular dystrophy (DMD) is a severe, progressive disease that affects 1 in 3600-6000 live male births. Although guidelines are available for various aspects of DMD, comprehensive clinical care recommendations do not exist. The US Centers for Disease Control and Prevention selected 84 clinicians to develop care recommendations using the RAND Corporation-University of California Los Angeles Appropriateness Method. The DMD Care Considerations Working Group evaluated assessments and interventions used in the management of diagnostics, gastroenterology and nutrition, rehabilitation, and neuromuscular, psychosocial, cardiovascular, respiratory, orthopaedic, and surgical aspects of DMD. These recommendations, presented in two parts, are intended for the wide range of practitioners who care for individuals with DMD. They provide a framework for recognising the multisystem primary manifestations and secondary complications of DMD and for providing coordinated multidisciplinary care. In part 1 of this Review, we describe the methods used to generate the recommendations, and the overall perspective on care, pharmacological treatment, and psychosocial management. Copyright 2010 Elsevier Ltd. All rights reserved.

              Towards a comprehensive structural variation map of an individual human genome

              Background

              Comprehensive catalogues of genetic variation are crucial for genotype and phenotype correlation studies [1-8], in particular when rare or multiple genetic variants underlie traits or disease susceptibility [9,10]. Since 2007, several personal genomes have been sequenced, capturing different extents of their genetic variation content (Additional file 1) [1-8,11]. In the first publication (J Craig Venter's DNA named HuRef) [1], variants were identified based on a comparison of the Venter assembly to the National Center for Biotechnology Information (NCBI) reference genome (build 36). In total, 3,213,401 SNPs and 796,167 structural variants (SVs; here SV encompasses all non-SNP variation) were identified in that study. Similar numbers of SNPs, but significantly fewer SVs (ranging from approximately 137,000 to approximately 400,000), are reported in other individual genome sequencing projects [2-4,6-8,11]. It is clear that even with deep sequence coverage, annotation of structural variation remains very challenging, and the full extent of SV in the human genome is still unknown. Microarrays [12-14] and sequencing [15-18] have revealed that SV contributes significantly to the complement of human variation, often having unique population [19] and disease [20] characteristics. Despite this, there is limited overlap in independent studies of the same DNA source [21,22], indicating that each platform detects only a fraction of the existing variation, and that many SVs remain to be found. In a recent study using high-resolution comparative genomic hybridization arrays, the authors found that approximately 0.7% of the genome was variable in copy number in each hybridization of two samples [19]. Yet, these experiments were limited to the detection of unbalanced variation larger than 500 bp, and the total amount of variation between two genomes would therefore be expected to exceed 0.7%.
Our objective in the present study was to annotate the full spectrum of genetic variation in a single genome. We used the previously sequenced Venter genome due to the availability of DNA and full access to genome sequence data. The assembly comparison method presented in the initial sequencing of this genome [1] discovered an unprecedented number of SVs in a single genome; however, the approach relied on an adequate diploid assembly. As there are known limitations in assembling alternative alleles for SV [1], we expected that there was still a significant amount of variation to be found. In an attempt to capture the full spectrum of variation in a human genome, this current study uses multiple sequencing- and microarray-based strategies to complement the results of the assembly comparison approach in the Levy et al. [1] study. First, we detect genetic variation from the original Sanger sequence reads by direct alignment to NCBI build 36 assembly, bypassing the assembly step. Furthermore, using custom high density microarrays, we probe the Venter genome to identify variants in regions where sequencing-based approaches may have difficulties (Figure 1). We discover thousands of new SVs, but also find biases in each method's ability to detect variants. Our collective data reveal a continuous size distribution of genetic variants (Figure 2a) with approximately 1.58% of the Venter haploid genome encompassed by SVs (39,520,431 bp or 1.28% as unbalanced SVs and 9,257,035 bp or 0.30% as inversions) and 0.1% as SNPs (Table 1, Figure 2). While there is still room for improvement, our results give the best estimate to date of the variation content in a human genome, provide an important resource of SVs for other personal genome studies, and highlight the importance of using multiple strategies for SV discovery. Figure 1 Overall workflow of the current study. 
Two distinct technologies were used to identify SV in the Venter genome: whole genome sequencing and genomic microarrays. The sequencing experiments, the construction of the Venter genome assembly, and the assembly comparison with NCBI build 36 (B36) reference had been completed in previous studies [1,16,39]. Hence, these experiments are shown as blue boxes. The scope of the current study is denoted in orange boxes. We re-analyzed the initial sequencing data, and searched for SVs in sequence alignments by the mate-pair and split-read approaches. We also used three distinct comparative genomic hybridization (CGH) array platforms: Agilent 24 M, NimbleGen 42 M and Agilent 244 K. Unlike the other array platforms, which were designed based on the B36 assembly, the Agilent 244 K targeted scaffold segments unique to the Celera/Venter assembly. To denote this, Figure 1 shows a dotted line connecting between the assembly comparison outcome and the Agilent 244 K box. Finally, the Affymetrix 6.0 and Illumina 1 M SNP arrays were also used in the present study. Figure 2 Size distribution of genetic variants. (a) A non-redundant size spectrum of SNP and CNV (including indels) and a breakdown of the proportion of gain to loss. The indel/CNV dataset consists of variants detected by assembly comparison, mate-pair, split-read, NimbleGen 42 M comparative genomic hybridization (CGH) and Agilent 24 M. The results show that the number and the size of variants are negatively correlated. Although the proportions of gains and losses are quite equal across the size spectrum, there are some deviations. Losses are more abundant in the 1 to 10 kb range, and this is mainly due to the inability of the 2-kb and 10-kb library mate-pair clones to detect insertions larger than their clone size. The opposite is seen for large events, where duplications are more common than deletions, which may be due to both biological and methodological biases. 
The increase in the number of events near 300 bp and 6 kb can be explained by short interspersed nuclear element (SINE) and long interspersed nuclear element (LINE) indels, respectively. The general peak around 10 kb corresponds to the interval with the highest clone coverage. (b) Size distribution of gains (insertions and duplications) highlighting the detection range of each methodology. The split-read method is designed to capture insertions from 11 bp to the size of a Sanger-based sequence read (approximately 1 kb). There is no insertion detected in the size range between the 2 kb and 10 kb library using the mate-pair approach. Furthermore, due to technical limitations, large gains (≥ 100,000 bp) cannot be identified with the sequencing-based approaches, while these are readily identified by microarrays. (c) Size distribution of deletions.

Table 1 Structural variants detected by different methods

| Method | Type | Number | Minimum size (bp) | Median size (bp) | Maximum size (bp) | Total size (bp) |
|---|---|---|---|---|---|---|
| Assembly comparison^a | Homo. insertion | 275,512 | 1 | 2 | 82,711 | 3,117,039 |
| | Homo. deletion | 283,961 | 1 | 2 | 18,484 | 2,820,823 |
| | Hetero. insertion | 136,792 | 1 | 1 | 321 | 336,374 |
| | Hetero. deletion | 99,814 | 1 | 1 | 349 | 250,300 |
| | Inversion | 88 | 102 | 1,602 | 686,721 | 1,627,871 |
| Mate-pair | Insertion | 780 | 346 | 3,588 | 28,344 | 3,880,544 |
| | Deletion | 1,494 | 340 | 3,611 | 1,669,696 | 10,531,345 |
| | Inversion | 105 | 368 | 3,121 | 2,026,495 | 8,068,541 |
| Split-read | Insertion | 8,511 | 11 | 16 | 414 | 224,022 |
| | Deletion | 11,659 | 11 | 18 | 111,714 | 1,764,522 |
| Agilent 24 M | Duplication | 194 | 445 | 1,274 | 113,465 | 1,065,617 |
| | Deletion | 319 | 439 | 1,198 | 852,404 | 2,779,880 |
| NimbleGen 42 M | Duplication | 366 | 448 | 4,665 | 836,362 | 11,292,451 |
| | Deletion | 358 | 459 | 2,460 | 359,736 | 3,861,282 |
| Affymetrix 6.0 | Duplication | 17 | 8,638 | 42,798 | 640,474 | 2,011,557 |
| | Deletion | 21 | 2,280 | 13,145 | 856,671 | 1,978,028 |
| Illumina 1 M | Duplication | 3 | 11,539 | 22,148 | 87,670 | 121,357 |
| | Deletion | 9 | 8,576 | 32,199 | 145,662 | 431,131 |
| Custom Agilent 244 K | Duplication | 44 | 219 | 1,356 | 8,737 | 98,529 |
| | Deletion | 7 | 170 | 332 | 2,258 | 4,130 |
| Non-redundant total^b | Insertion/duplication | 417,206 | 1 | 1 | 836,362 | 19,981,062 |
| | Deletion | 390,973 | 1 | 2 | 1,669,696 | 19,539,369 |
| | Inversion | 167 | 102 | 1,249 | 2,026,495 | 9,257,035 |

^a We used an italicized font to distinguish the results from the Levy et al. [1] study. Moreover, from that previous study, we included all homozygous indels, heterozygous indels, indels embedded within simple, bi-allelic, and non-ambiguously mapped heterozygous mixed sequence variants, and only those inversions whose size is at most 3 Mb.
^b Complete data are presented in Additional files 19, 20 and 21. Non-redundant variation size distribution is presented in Figure 2a.

Results

Several different analytical and experimental strategies were employed to exhaustively analyze the Venter genome for SV. An overview of the different analyses performed is shown in Figure 1.

Sequencing-based variation

We first used computational strategies to extract additional SV information from the existing Sanger-based sequencing data generated as paired-end (or mate-pair) reads from clone libraries of defined size [1].
First, we adopted a paired-end mapping approach [15,17,18] and aligned 11,346,790 mate-pairs from libraries with expected clone sizes of 2, 10 or 37 kb (Additional file 2) to the NCBI build 36 assembly. We found that 97.3% of mate-pairs had the expected mapping distance and orientation. Mate-pairs discordant in orientation or mapping distance were used to identify variants, and we required each event to be supported by at least two clones. In total, this strategy was used to identify 780 insertions, 1,494 deletions and 105 inversions (Figure 1; Table 1; Additional file 3). In an independent analysis of the same underlying sequencing data, we then captured SVs by examining the alignment profiles of 31,546,016 paired and unpaired reads to search for intra-alignment gaps [23]. The presence of an intra-alignment gap in the sequence read (query sequence) or in the reference genome (target sequence) would indicate a putative insertion or deletion event, respectively. The identification of such a 'split-read' alignment signature complements the mate-pair approach, as significantly smaller insertions and deletions can be discovered. We required at least two overlapping split-reads having an alignment gap >10 bp to call a variant. A total of 8,511 insertions and 11,659 deletions ranging from 11 to 111,714 bp in size were identified (Figure 1; Table 1; Additional file 4). Array based variation We used two ultra-high density custom comparative genomic hybridization (CGH) array sets and two commonly used SNP genotyping arrays to identify relative gains and losses. A significant amount of variation was detected from the two custom CGH arrays: an Agilent oligonucleotide array set with 24 million features (Agilent 24 M) [7], and a NimbleGen oligonucleotide array set containing 42 million features (NimbleGen 42 M) [19]. 
The Agilent platform identified 194 duplications and 319 deletions, while the NimbleGen array set detected 366 gains and 358 losses, ranging in size from 439 bp to 852 kb, in Venter (Figure 1; Table 1; Additional files 5 and 6). Furthermore, we scanned the Venter genome using Affymetrix SNP Array 6.0 and Illumina BeadChip 1 M, and the results are summarized in Table 1 plus Additional files 7 and 8. Most microarrays used for CNV analyses are designed based on the NCBI assemblies. Therefore, any region where the reference exhibits the deletion allele of an indel, or sequences mapping to gaps in the assembly, will not be targeted. In previous studies [16,24], many unknown DNA segments were identified to have no or poor alignment to the NCBI reference when compared to the Celera R27C assembly. To capture genetic variation in such potentially novel sequences, we designed a custom Agilent 244 K array to target those scaffold sequences at least 500 bp in length. We then performed CGH on seven HapMap individuals and detected 231 regions (101 gains and 130 losses) in 161 scaffolds to be variable (Additional file 9). Of these, we found 44 gains and 7 losses in 36 Celera scaffolds were specific to Venter (Figure 1, Table 1). Using paired-end mapping, as well as cross-species genome comparison with the chimpanzee, we were able to find a placement in NCBI build 36 for 25 of 36 scaffolds that were copy number variable in Venter. Two of the scaffolds were mapped to regions containing assembly gaps, 15 of 25 anchored scaffolds corresponded to insertion events also detected elsewhere [15,18], and the remaining eight represent new insertion findings (Additional file 10). Validation of findings We used several computational and experimental approaches to validate our SV findings. We performed experimental validation by PCR amplification and gel-sizing and confirmed 89 of the 96 (93%) SVs predicted by sequence analysis (Additional files 11 and 12). 
Using quantitative real-time PCR (qPCR), we validated 20 of 25 (80%) CNVs detected by microarrays, and most of these CNVs were from the custom Agilent 244 K array covering sequences not in the NCBI assembly (Additional file 13). Inversion predictions were tested by fluorescence in situ hybridization (FISH) [25]. In one such finding, a predicted 1.1-Mb inversion at 16p12 was identified to be homozygous in Venter and in all of the seven additional HapMap samples from four populations tested, suggesting that the reference at this locus represents a rare allele, or is incorrectly assembled (Additional file 14). We then compared the SVs identified here with the previous assembly comparison-based analysis of the same genome [1], and found that 11,140 variants were in common. We noticed that our multi-platform method excelled in calling large variants. In fact, even after excluding all of the small variants (≤ 10 bp) from the previous Levy et al. study [1], we still observed that the current study tended to find larger SVs (an average of 1,909.3 bp in the current study versus a previous average of 113.4 bp). Additional file 15 shows that the sensitivity of assembly comparison dropped as size increased to over 1 kb, and the proportion of larger SVs significantly increased as a result of the present study (Figure 2b, c). Finally, we determined the number of calls in this study that were either verified by another platform in this study or found in the Database of Genomic Variants [12]. In total, we computationally confirmed 15,642 (65.6%) of our current calls: 6,301 were gains; 9,276 were losses; and 65 were inversions.

Cross-platform comparison

We performed an in-depth analysis of the characteristics of the variants detected by each of the methods. First, by contrasting against a population-based study [19], we observed highly similar size estimates for the same underlying SVs between methods (Figure 3).
With sufficient genome coverage of clones with accurate and tight insert size, the mate-pair method yields precise variation size. Similarly, the split-read approach gives nucleotide resolution breakpoints, while the high-density CGH and SNP arrays have dense probe coverage to accurately identify the start and end points of SVs. Overall, our multiple approaches are highly robust in estimating variant size. Figure 3 Agreement between the non-redundant set of Venter CNVs and genotype-validated variable loci. The agreement between sites identified by different detection methods was measured by the percentage of reciprocal overlap between the estimated size for the non-redundant set of Venter variants and the estimated size for the CNVs generated and genotyped in the Genome Structural Variation (GSV) population genetics study [19]. Two sites were considered overlapping if the reciprocal overlap among their estimated sizes was ≥ 50%. The lower right corner plot summarizes the mean discrepancy between Venter and GSV loci sizes, as a proportion of the GSV-estimated CNV size. Next, we compared the variants discovered by the two whole genome CGH array sets, NimbleGen 42 M and Agilent 24 M, and investigated the primary reason for the discordance between the two data sets. Not surprisingly, a substantial portion of the discordant calls can be explained by the difference in probe coverage. In fact, approximately 70% of the unique calls on the NimbleGen 42 M array had inadequate probe coverage on the Agilent 24 M array to be able to call variants, and approximately 30% vice versa (Additional file 16). After that, we compared the number of calls uniquely identified by the SNP-genotyping microarrays, and we identified 12 and 0 novel SVs contributed by Affymetrix 6.0 and Illumina 1 M, respectively. Of the 12 new Affymetrix calls, 9 are located in complex regions containing blocks of segmental duplications. 
Subsequently, when looking for enrichment of genomic features among variants detected by different approaches, we found that there was a significant enrichment (P < [...] > 1 kb, and Additional files 19, 20 and 21 for a complete list of variation in the Venter genome.

Comparison with other personal genomes

When we compared the complete set of Venter's SVs with those from other published genomes [2-4,6-8] (Additional file 1), we found that 209,493/808,345 (25.9%) of the Venter variants overlapped variants described in one or more of the other six studies. Upon examining the size distribution of variants from different studies, particularly the size of insertions and duplications, we realized that studies based primarily on next generation sequencing (NGS) data for variation calling were unable to identify calls in certain size ranges (Figure 4). These results further indicate that, at present, multiple approaches are needed to capture SVs across the entire size spectrum. The most obvious limitation is that short NGS reads/inserts fail to capture insertion events greater than the size of the reads/inserts.

Figure 4 Difference in the size distributions of reported indels/CNVs in published personal genome sequencing studies. The graphs show variation found in a few personal genome sequencing studies [1-4,6-8]. These diagrams indicate that multiple approaches are needed for better detection of CNVs. Here, the total variant set in the Venter genome found in both the Levy et al. [1] and the current study is displayed. Unlike the current study where the size of mate-pair indels is equal to the difference between the mapping distance and the expected insert size, the SVs in the Ahn et al. [6] study are only based on the mapping distance. Besides the NGS data, we have also included the variants detected by the high density Agilent 24 M data in the Kim et al. [7] study. In Wheeler et al. [2], insertions identified by intra-read alignment would be limited by the size of the sequencing read; hence, large insertions beyond the read length were not detected. Wang et al. [4], Kim et al. [7], and McKernan et al. [8] detected small variants based on split-reads and large ones based on mate-pairs and microarrays, but failed to detect variation between these size ranges. Also, see Additional file 1. (a) Insertion and duplication size distribution. (b) Deletion size distribution.

Functional importance of structural variation

Next, we analyzed the complete set of SVs in Venter for overlap with features of the genome with known functional significance, which might influence health outcomes (Table 2). We found 189 genes to be completely encompassed by gains or losses, 4,867 non-redundant genes (3,126 impacted by gains and 3,025 by losses) whose exons were impacted, and 573 of these to be in the Online Mendelian Inheritance in Man (OMIM) Disease database (Additional files 22, 23, 24, 25 and 26). However, there was an overall paucity of SV (P ≥ 0.999) overlapping exonic sequences of genes associated with autosomal dominant/recessive diseases, cancer disease, and imprinted and dosage-sensitive genes. In general, there is an absence of variation in both exonic and regulatory sequences, such as enhancers, promoters and CpG islands, in the genome of this individual.
Table 2 Genomic landscape and structural variants in the Venter genome*

| Genomic feature (number of entries)^a | Gains^b: number of (%) genomic features | Gains^b: number of (%) structural variants | Gains^b: P-value | Losses^c: number of (%) genomic features | Losses^c: number of (%) structural variants | Losses^c: P-value |
|---|---|---|---|---|---|---|
| RefSeq gene loci^d (20,174) | 14,268 (70.72%) | 159,250 (38.17%) | 0.000 | 13,951 (69.15%) | 149,568 (38.26%) | 0.000 |
| RefSeq gene entire transcript loci^e (20,174) | 101 (0.50%) | 41 (0.01%) | 0.000 | 91 (0.45%) | 47 (0.01%) | 0.000 |
| RefSeq gene exons^f (20,174) | 3,126 (15.50%) | 3,890 (0.93%) | 0.999 | 3,025 (14.99%) | 3,723 (0.95%) | 0.999 |
| Enhancer elements (837) | 80 (9.56%) | 85 (0.02%) | 0.999 | 84 (10.04%) | 93 (0.02%) | 0.999 |
| Promoters (20,174) | 2,007 (9.95%) | 2,071 (0.50%) | 0.999 | 1,812 (8.98%) | 1,922 (0.49%) | 0.999 |
| Stop codons^g (30,885) | 225 (0.73%) | 99 (0.02%) | 0.000 | 272 (0.88%) | 134 (0.03%) | 0.563 |
| OMIM disease gene loci (3,737) | 1,658 (44.37%) | 20,589 (4.93%) | 0.000 | 1,664 (44.53%) | 19,396 (4.96%) | 0.000 |
| OMIM disease gene exons (3,737) | 367 (9.82%) | 458 (0.11%) | 0.999 | 383 (10.25%) | 492 (0.13%) | 0.999 |
| Autosomal dominant gene loci (316) | 247 (78.16%) | 2,773 (0.66%) | 0.023 | 245 (77.53%) | 2,593 (0.66%) | 0.031 |
| Autosomal dominant gene exons (316) | 60 (18.99%) | 70 (0.02%) | 0.999 | 64 (20.25%) | 78 (0.02%) | 0.999 |
| Autosomal recessive gene loci (472) | 386 (81.78%) | 3,931 (0.94%) | 0.065 | 402 (85.17%) | 3,749 (0.96%) | 0.009 |
| Autosomal recessive gene exons (472) | 58 (12.29%) | 78 (0.02%) | 0.999 | 86 (18.22%) | 109 (0.03%) | 0.999 |
| Cancer disease gene loci (363) | 301 (82.92%) | 4,202 (1.01%) | 0.651 | 307 (84.57%) | 3,899 (1.00%) | 0.821 |
| Cancer disease gene exons (363) | 66 (18.18%) | 85 (0.02%) | 0.999 | 71 (19.56%) | 98 (0.03%) | 0.999 |
| Dosage sensitive gene loci (145) | 120 (82.76%) | 2,995 (0.72%) | 0.604 | 125 (86.21%) | 2,794 (0.71%) | 0.728 |
| Dosage sensitive gene exons (145) | 39 (26.90%) | 51 (0.01%) | 0.999 | 41 (28.28%) | 58 (0.01%) | 0.999 |
| Genomic disorders (52) | 50 (96.15%) | 14,178 (3.40%) | 0.999 | 51 (98.08%) | 13,373 (3.42%) | 0.996 |
| Pharmacogenetic gene loci (186) | 97 (52.15%) | 853 (0.20%) | 0.517 | 96 (51.61%) | 838 (0.21%) | 0.105 |
| Pharmacogenetic gene exons (186) | 21 (11.29%) | 27 (0.01%) | 0.998 | 23 (12.37%) | 29 (0.01%) | 0.984 |
| Imprinted gene loci (59) | 39 (66.10%) | 405 (0.10%) | 0.989 | 37 (62.71%) | 378 (0.10%) | 0.982 |
| Imprinted gene exons (59) | 13 (22.03%) | 15 (0.00%) | 0.998 | 11 (18.64%) | 13 (0.00%) | 0.999 |
| MicroRNAs (685) | 8 (1.17%) | 9 (0.00%) | 0.785 | 11 (1.61%) | 9 (0.00%) | 0.836 |
| GWAS loci (419) | 415 (99.05%) | 9,413 (2.26%) | 0.000 | 416 (99.28%) | 8,852 (2.26%) | 0.000 |
| GWAS SNPs (419) | 1 (0.24%) | 1 (0.00%) | 0.786 | 2 (0.48%) | 2 (0.00%) | 0.810 |
| CpG islands (14,867) | 287 (1.93%) | 1,516 (0.36%) | 0.999 | 299 (2.01%) | 1,508 (0.39%) | 0.999 |
| DNAseI hypersensitivity sites (95,709) | 6,524 (6.82%) | 7,165 (1.72%) | 0.999 | 6,392 (6.68%) | 6,914 (1.77%) | 0.999 |
| Recombination hotspots (32,996) | 16,839 (51.03%) | 30,315 (7.27%) | 0.000 | 16,211 (49.13%) | 28,407 (7.27%) | 0.000 |
| Segmental duplications (51,809) | 17,172 (33.14%) | 13,864 (3.32%) | 0.999 | 16,518 (31.88%) | 13,177 (3.37%) | 0.999 |
| Ultra-conserved elements (481) | 2 (0.42%) | 2 (0.00%) | 0.999 | 2 (0.42%) | 2 (0.00%) | 0.999 |
| Affy 6.0 SNPs^h (907,691) | 1,556 (0.17%) | 389 (0.09%) | 0.999 | 3,022 (0.33%) | 934 (0.24%) | 0.999 |
| Illumina 1 M SNPs^i (1,048,762) | 2,318 (0.22%) | 601 (0.14%) | 0.999 | 4,789 (0.46%) | 1,536 (0.39%) | 0.999 |

*This table shows how structural variation affects different functional annotations and sequence characteristics in the Venter genome. The leftmost column shows the names and total number of genomic features. The rest of the table is divided between gains and losses. Within the gain category, the first left column shows the number of (and percentage of total) genomic features impacted, and the second column shows the corresponding number of (and percentage of total) gain variants, and the last column shows the significance of the overlap as determined by simulations. An identical format is used for the losses.
^a See Additional file 17 for a list of data sources.
^b Based on a non-redundant list of 417,206 gains and insertions detected in this and the Levy et al. [1] study of the Venter genome.
^c Based on a non-redundant list of 390,973 deletions detected in this and the Levy et al. [1] study of the Venter genome.
^d Genes where a structural variant resides anywhere within the transcript (exonic and intronic).
^e Genes from the RefSeq data set where the entire transcript locus is encompassed by the structural variant.
^f Genes from the RefSeq data set where exonic sequence is impacted by the structural variant. The non-redundant number of genes altered in some way by duplications and deletions is 4,867.
^g Structural variants that overlap/impact a stop codon from the RefSeq gene set.
^h Probes on the Affymetrix 6.0 Commercial array.
^i Probes on the Illumina 1 M array.
GWAS, genome-wide association studies; OMIM, Online Mendelian Inheritance in Man.

Currently, direct-to-consumer testing companies and genome-wide association studies mainly use microarray-based SNP data [26,27], but SVs are typically not considered. Venter indels/CNVs, however, overlap with 4,565 and 7,047 of the SNPs on the Affymetrix SNP Array 6.0 and Illumina BeadChip 1 M products (two commonly used arrays), respectively, potentially impacting genotype calling, most notably when deletions are involved. Moreover, our attempts to impute SV calls using tagging SNPs captured 308 of 405 (76.0%) Venter bi-allelic SVs for which we could infer genotypes (Additional file 27) [19]. Based on population data, rare SVs with minor allele frequency ≤ 0.05 showed the lowest correlation with surrounding SNPs, thus indicating that these SVs were least imputable (Figure 5). The fraction of imputable SVs will be even lower when multi-allelic and complex SVs are considered because the new mutation rate at these sites is higher.

Figure 5 Tagging pattern for HuRef SVs as a function of their minor allele frequency (MAF). Linkage disequilibrium is depicted as the best r² between an SV and a HapMap SNP in 120 Europeans (CEU).
There were a total of 405 bi-allelic polymorphic SV sites of overlap between GSV and HuRef loci; 24% of the SV loci have a HapMap SNP with r² < [...] > 50% reciprocal overlap and opposite orientation of variants identified in NA15510 in Conrad et al. [19] were removed, as these were specific to the reference. The Venter sample was also run on the Affymetrix SNP Array 6.0 and Illumina BeadChip 1 M genotyping arrays. We followed the protocol recommended by the manufacturers. For Affymetrix 6.0, the default parameters in the BirdSeed v2 algorithm were used to perform SNP calling. Partek Genomics Suite (Partek Inc., St. Louis, Missouri, USA), Genotyping Console (Affymetrix, Inc., Santa Clara, California, USA), BirdSuite [42] and iPattern (J Zhang et al., manuscript submitted) were used to call CNVs. For Illumina 1 M, the SNP calling was done using the BeadStudio software. QuantiSNP [43] and iPattern were used to identify CNVs. For both platforms, only variants confirmed by at least two calling algorithms were included in the final set of calls. The Agilent Custom Human 244 K CGH array (Agilent 244 K) was designed to target 9,018 sequences >500 bp in length that were annotated as 'unmatched' sequences in Khaja et al. [16]. CGH experiments were performed with genomic DNA from Venter and six HapMap samples, hybridized against reference NA10851. Feature extraction and normalization were performed using the Agilent feature extraction software. The programs ADM-1 in the DNA Analytics 4.0 suite (Agilent Technologies, Santa Clara, California, USA), and GADA [44] were independently used to call CNVs, and those that were confirmed by both algorithms were then used in this study.

Non-redundant variant data set

To generate a non-redundant set of Venter variants, we combined the lists of SVs generated. For CNVs, to determine if two calls are the same, we required that they shared a minimum of 50% size reciprocal overlap; for inversions, we required that they shared at least one boundary.
For calls identified as the same variant, we recorded the one with the best size/boundary estimate (with preference given to assembly comparison, then split-read, NimbleGen 42 M, Agilent 24 M, mate-pair, Affymetrix 6.0, and Illumina 1 M, in that order). For this analysis, we excluded variants called on the custom Agilent 244 K arrays.
PCR and quantitative real-time PCR validation
We used multiple computational and experimental approaches to validate SVs found in this project. PCR primers were designed to target sequences flanking the indels detected by sequencing-based methods, such that PCR products representing the different alleles could be differentiated on a 1.5% agarose gel. DNA from Venter and five HapMap individuals of European ancestry was tested in PCR experiments. Amplifications and deletions detected by CGH arrays were tested by qPCR. DNA from Venter and six additional control individuals was used to assess the variability in copy number. Each assay was run in triplicate, and the FOXP2 gene was used as the reference for relative quantification. See Additional file 12 for all primer sequences.
FISH validation
To validate large variants, FISH experiments were performed using fosmid clones as probes on a lymphoblastoid cell line from Venter and seven other HapMap individuals. Five metaphases were first imaged to check for correct chromosome localization and hybridization, and then interphase FISH was performed to validate predicted inversions, similar to the protocol outlined in the Feuk et al. study [25] with the addition of the aqua probe, DEAC-5-dUTP (Perkin Elmer, Waltham, Massachusetts, USA; NEL455).
Overlap analysis
Overlap analyses against other data sets, against genomic features, and between subsets of data in the current paper were performed using custom Perl scripts. When comparing variants, two sites were considered overlapping if the reciprocal overlap among their estimated sizes was ≥ 50%.
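The 50% reciprocal overlap criterion can be illustrated with a minimal Python sketch. It is not the custom script used in the study. The function names, the `(chromosome, start, end)` tuple layout, and the half-open coordinate convention are assumptions made for illustration.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the smaller of the two overlap fractions (0.0 if the intervals are disjoint).

    Coordinates are assumed to be half-open: [start, end).
    """
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    # The overlap must cover the required fraction of BOTH calls,
    # so the smaller of the two fractions is the one that matters.
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def same_variant(a, b, threshold=0.5):
    """Treat two (chromosome, start, end) calls as one variant at >= threshold reciprocal overlap."""
    return a[0] == b[0] and reciprocal_overlap(a[1], a[2], b[1], b[2]) >= threshold


if __name__ == "__main__":
    print(same_variant(("chr1", 100, 200), ("chr1", 150, 250)))   # equal-sized calls sharing half their length
    print(same_variant(("chr1", 100, 200), ("chr1", 100, 1000)))  # small call nested in a much larger one
```

Taking the minimum of the two fractions is what makes the criterion reciprocal: a small call fully contained in a much larger one is not merged with it, because the overlap covers only a small part of the larger call.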
Data sources used for the annotations of overlaps with genomic features are listed in Additional file 17. To evaluate significance, we created 1,000 randomized sets of simulated variant calls and performed overlap analysis against the same data source. For each simulation, we recorded whether it produced a higher number of overlaps than the real variant data set. A P-value was computed as the fraction of simulations whose number of overlaps was greater than the number of real overlaps.
Structural variation imputation
Using a cutoff of 50% reciprocal overlap, there were 405 sites of overlap between the Venter loci and the genotyped, validated Genome Structural Variation (GSV) loci. The best r² value was computed between each of those GSV CNVs and a HapMap SNP genotyped in Europeans in the neighboring genomic region. We defined a minimum threshold of r² = 0.8, below which the Venter SVs were deemed not well imputed by SNPs. A detailed description of genotyping, phasing, and tagging calls onto haplotypes defined by HapMap SNPs is presented in the Conrad et al. study [19].
Data release
The sequence trace files generated from previous studies [1,39] can be obtained from the 'NCBI Trace Archive', using queries [CENTER_NAME = "JCVI" and SPECIES_CODE = "HOMO SAPIENS" and center_project = "GENOMIC-SEQUENCING-DIPLOID-HUMAN-REFERENCE-GENOME"], [INSERT_SIZE = 10201 and CENTER_NAME = "CRA" and SPECIES_CODE = "homo sapiens"], and [INSERT_SIZE = 1925 and CENTER_NAME = "CRA" and SPECIES_CODE = "homo sapiens"]. All of the microarray data generated in this study are available at the Gene Expression Omnibus (GEO) under the accession number [GEO:GSE20290]. The SV locations, size, and zygosity (when available) are reported in Additional files 3, 4, 5, 6, 7, 8 and 9, and a non-redundant set of variant data in the Venter genome is reported in Additional files 19, 20 and 21.
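The randomization test described under Overlap analysis can be sketched in Python as follows. This is a hedged illustration, not the study's implementation. The text does not say how the simulated calls were generated, so the sketch assumes size-preserving uniform placement on a single linear coordinate space of length `genome_length`, and it counts a variant as overlapping a feature if the two share at least one base. All function and parameter names are hypothetical.

```python
import random


def count_overlaps(variants, features):
    """Count variants sharing at least one base with any feature (half-open intervals)."""
    return sum(
        any(v_start < f_end and f_start < v_end for f_start, f_end in features)
        for v_start, v_end in variants
    )


def shuffle_variants(variants, genome_length, rng):
    """Place each variant at a uniformly random position, keeping its size (an assumption)."""
    shuffled = []
    for start, end in variants:
        size = end - start
        new_start = rng.randrange(0, genome_length - size + 1)
        shuffled.append((new_start, new_start + size))
    return shuffled


def empirical_p_value(variants, features, genome_length, n_sim=1000, seed=0):
    """Fraction of simulated sets with strictly more overlaps than the real set."""
    rng = random.Random(seed)
    observed = count_overlaps(variants, features)
    exceed = sum(
        count_overlaps(shuffle_variants(variants, genome_length, rng), features) > observed
        for _ in range(n_sim)
    )
    return exceed / n_sim


if __name__ == "__main__":
    # Toy data: the real variants sit on the features, so few random sets should do better.
    features = [(100, 200), (5000, 5100)]
    variants = [(120, 180), (5010, 5050)]
    print(empirical_p_value(variants, features, genome_length=100_000))
```

The strict inequality follows the wording of the text ("greater than the number of real overlaps"). A real analysis would also need to handle multiple chromosomes and assembly gaps, which this sketch ignores.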
Abbreviations
bp: base pair; CGH: comparative genomic hybridization; CNV: copy number variation; FISH: fluorescence in situ hybridization; GSV: Genome Structural Variation; indel: insertion/deletion; NCBI: National Center for Biotechnology Information; NGS: next generation sequencing; OMIM: Online Mendelian Inheritance in Man; qPCR: quantitative real-time PCR; SINE: short interspersed nuclear element; SNP: single nucleotide polymorphism; SV: structural variation.
Authors' contributions
AWP, JRM, DP, DFC, HP, MEH, CL, JCV, EFK, SL, LF and SWS conceived and designed the experiments. AWP, JRM, JW, MAR, and LF performed the mate-pair and split-read analysis, as well as the Affymetrix 6.0 and Illumina 1 M experiments. HP and CL performed the Agilent 24 M experiments, while DP, DFC, and MEH did the NimbleGen 42 M experiments. All authors analyzed the data. AWP, LF and SWS wrote the paper. All authors read and approved the final manuscript.
Supplementary Material
Additional file 1 Genetic variation in sequenced genomes.
Additional file 2 Clone library information.
Additional file 3 Mate-pair variants and comparison with various data sets.
Additional file 4 Split-read variants and comparison with various data sets.
Additional file 5 Agilent 24 M variants and comparison with various data sets.
Additional file 6 NimbleGen 42 M variants and comparison with various data sets.
Additional file 7 Affymetrix 6.0 variants and comparison with various data sets.
Additional file 8 Illumina 1 M variants and comparison with various data sets.
Additional file 9 Custom Agilent 244 K copy number variants.
Additional file 10 Custom Agilent 244 K copy number variable-scaffolds anchoring information.
Additional file 11 Example of a PCR-validated insertion event with size 84 bp predicted by the split-read approach. A pair of primers, separated by 497 bp, was designed surrounding the insertion site. PCR was run with these primers, and the presence of the insertion was resolved by gel electrophoresis. Starting from the right, DNA from five European controls, DNA from Venter and a negative control were added in lanes 1 to 5, lane 6 and lane 7, respectively.
Additional file 12 List of validated variants and their primers and probes.
Additional file 13 Example of a qPCR-validated gain in Venter relative to sample NA10851 as detected by the custom Agilent 244 K aCGH. A 4.2-kb CNV was detected on the Celera scaffold GA_x5YUVVTY6, and by qPCR, we found that NA10851 had a heterozygous loss in that region, thus confirming a relative gain in Venter.
Additional file 14 A common inversion on 16p12.2 validated by FISH. (a) A 2-Mb website schematic of the region. This 1.1-Mb inversion was detected by the mate-pair method in Venter as seen in track 'B_Clone'. The track 'Inversions' shows that this inversion was annotated in three other studies [15,17,18]. (b) An image of a four-color FISH experiment revealing that Venter is homozygous for the 16p12.2 inverted allele. Four differentially labeled fosmid probes were scored in >100 interphase FISH experiments, and the order of the probes in Venter was found in the vast majority of experiments (including in seven HapMap controls from four different populations) to be yellow-green-blue-pink. In the absence of the inversion, the order of the probes would be yellow-blue-green-pink as depicted in the assembly schematic. Therefore, as discussed in the main text, our data suggest that the NCBI build 36 reference represents a rare allele, or may be incorrect.
Additional file 15 Comparative analysis of variants discovered in Levy et al. [1] and the current study. The two graphs illustrate the proportion of SVs identified by the assembly comparison method, by our present combined multi-approach strategy (including mate-pair, split-read, CGH arrays and SNP arrays), and the proportion confirmed by both. The x-axis represents size range, while the numbers at the top indicate the total number of calls in a particular size range. As size increases, the number of variants called by assembly comparison decreases significantly, indicating that the method has limited sensitivity for detecting large variants. In contrast, the combined multi-approach strategy of the current study is better suited to finding large variants. (a) Size distribution of gains. (b) Size distribution of losses.
Additional file 16 Cumulative distribution of probe coverage. (a) Agilent 24 M array probe coverage across NimbleGen 42 M variants. The x-axis begins at 5, the minimum number of probes required to call variants on the Agilent array. Hence, the majority of the unconfirmed NimbleGen variants (approximately 70%) were targeted by fewer than five Agilent probes. (b) NimbleGen 42 M array probe coverage across Agilent 24 M variants. The x-axis begins at 10, which is the number of probes required for the NimbleGen array to make a call.
Additional file 17 A summary list of structural variants that overlap with genomic features.
Additional file 18 Genome-wide distribution of large SVs in Venter. The sites of 2,772 SVs whose position spans >1 kb are shown. Red bars represent insertions or duplications, blue bars represent deletions, and green bars represent inversions.
Additional file 19 A non-redundant set of Venter insertions and duplications.
Additional file 20 A non-redundant set of Venter deletions.
Additional file 21 A non-redundant set of Venter inversions.
Additional file 22 List of Venter gains that overlap with exons of RefSeq genes.
Additional file 23 List of Venter losses that overlap with exons of RefSeq genes.
Additional file 24 List of Venter gains that overlap with exons of OMIM genes.
Additional file 25 List of Venter losses that overlap with exons of OMIM genes.
Additional file 26 A detailed list of genes that are completely encompassed by non-redundant gains and losses.
Additional file 27 Comparison of Venter SVs with population-based genotyped and SNP-imputable CNVs.
                Author and article information

                Contributors
                volker.straub@ncl.ac.uk
                Journal
                Genet Med
                Genet. Med
                Genetics in Medicine
                Nature Publishing Group US (New York )
                1098-3600
                1530-0366
                11 June 2020
                2020
                Volume: 22
                Issue: 9
                Pages: 1478-1488
                Affiliations
                [1] GRID grid.420004.2, ISNI 0000 0004 0444 2244, John Walton Muscular Dystrophy Research Centre, Translational and Clinical Research Institute, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
                [2] GRID grid.32224.35, ISNI 0000 0004 0386 9924, Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
                [3] GRID grid.66859.34, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
                [4] GRID grid.240404.6, ISNI 0000 0001 0440 1889, Queen’s Medical Centre, Nottingham University Hospitals NHS Trust, Nottingham, UK
                [5] GRID grid.413396.a, ISNI 0000 0004 1768 8905, Neuromuscular Disorders Unit, Neurology Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain
                [6] GRID grid.411414.5, ISNI 0000 0004 0626 3418, Neuromuscular Reference Centre, Department of Neurology, Antwerp University Hospital, Antwerpen, Belgium
                [7] GRID grid.5284.b, ISNI 0000 0001 0790 3681, Laboratory of Neuromuscular Pathology, Institute Born-Bunge, University of Antwerp, Antwerpen, Belgium
                [8] GRID grid.4808.4, ISNI 0000 0001 0657 4636, Clinical Medical Centre Zagreb, University of Zagreb, Zagreb Medical School, Zagreb, Republic of Croatia
                [9] Colentina University Hospital-CDPC-CCC, Bucharest, Romania
                [10] GRID grid.8194.4, ISNI 0000 0000 9828 7548, University of Medicine and Pharmacy Carol Davila Bucharest, Faculty of Dental Medicine, Bucharest, Romania
                [11] GRID grid.5963.9, Department of Neuropediatrics, Faculty of Medicine, University of Freiburg, Freiburg, Germany
                [12] GRID grid.410563.5, ISNI 0000 0004 0621 0092, Department of Neurology, UMHAT, Alexandrovska Medical University, Sofia, Bulgaria
                [13] Department of Neurology, University Hospitals Leuven, and Laboratory for Muscle Diseases and Neuropathies, KU Leuven, Leuven, Belgium
                [14] GRID grid.411160.3, ISNI 0000 0001 0663 8628, Unitat de Patologia Neuromuscular, Servei de Neurologia, Hospital Sant Joan de Deu, ISCIII, CIBERER, Barcelona, Spain
                [15] GRID grid.4989.c, ISNI 0000 0001 2348 0746, Department of Pediatric Neurology, Neuromuscular Reference Center, Hôpital Erasme, Université Libre de Bruxelles, Brussels, Belgium
                [16] GRID grid.4989.c, ISNI 0000 0001 2348 0746, Department of Pediatric Neurology, Neuromuscular Reference Center, Hôpital Universitaire des Enfants Reine Fabiola, Université Libre de Bruxelles, Brussels, Belgium
                [17] GRID grid.410566.0, ISNI 0000 0004 0626 3303, Neuromuscular Reference Center, Universitair Ziekenhuis Gent (UZ Gent), Ghent, Belgium
                [18] GRID grid.4989.c, ISNI 0000 0001 2348 0746, Neuromuscular Reference Center, Hôpital Universitaire des Enfants Reine Fabiola (HUDERF), ULB, Brussels, Belgium
                [19] GRID grid.452372.5, ISNI 0000 0004 1791 1185, Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), Madrid, Spain
                [20] Instituto de Investigación Hospital 12 de Octubre Centro de Actividades Ambulatorias, Madrid, Spain
                [21] GRID grid.415490.d, ISNI 0000 0001 2177 007X, West of Scotland Centre for Genomic Medicine, Queen Elizabeth University Hospital, Glasgow, UK
                [22] GRID grid.9601.e, ISNI 0000 0001 2166 6619, Department of Neurology, Istanbul University, Istanbul Faculty of Medicine, Istanbul, Turkey
                [23] GRID grid.7269.a, ISNI 0000 0004 0621 1570, Neuromuscular Unit, Faculty of Medicine, Ain Shams University, Cairo, Egypt
                [24] GRID grid.415490.d, ISNI 0000 0001 2177 007X, Institute of Neurological Sciences, Queen Elizabeth University Hospital, Glasgow, UK
                [25] Neuroscience Area, Biodonostia Research Institute, San Sebastián, Gipuzkoa, Spain
                [26] GRID grid.414651.3, Department of Neurology, Hospital Universitario Donostia, OSAKIDETZA, San Sebastián, Gipuzkoa, Spain
                [27] GRID grid.413448.e, ISNI 0000 0000 9314 1427, CIBERNED, Instituto de Salud Carlos III, Ministry of Science, Innovation and Universities, Madrid, Spain
                [28] GRID grid.413396.a, ISNI 0000 0004 1768 8905, Genetics Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain
                [29] GRID grid.4491.8, ISNI 0000 0004 1937 116X, Department of Neurology, 2nd Faculty of Medicine, Charles University, Prague, Czech Republic
                [30] GRID grid.4488.0, ISNI 0000 0001 2111 7257, Department of Neuropediatrics, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
                [31] GRID grid.8664.c, ISNI 0000 0001 2165 8627, Department of Child Neurology, Justus-Liebig-University Giessen, Gießen, Germany
                [32] GRID grid.412688.1, ISNI 0000 0004 0397 9648, Department of Pathology and Cytology, University Hospital Center Zagreb, School of Medicine Zagreb, Zagreb, Republic of Croatia
                [33] GRID grid.497559.3, Department of Neurology, Complejo Hospitalario de Navarra, IdisNa, Pamplona, Spain
                [34] GRID grid.414269.c, ISNI 0000 0001 0667 6181, Neuromuscular Consult Unit, Hospital Universitario Basurto, Bilbao, Spain
                [35] GRID grid.488518.8, ISNI 0000 0004 0375 2558, Riga East University Hospital, Rare Disease Unit, Riga, Latvia
                [36] GRID grid.412347.7, ISNI 0000 0004 0509 0981, Department of Neuropediatrics, University Childrens Hospital Basel, Basel, Switzerland
                [37] GRID grid.411656.1, ISNI 0000 0004 0479 0855, Department of Neuropediatrics Inselspital Bern, Bern, Switzerland
                [38] Department of Neuropediatrics, Centre for Neuromuscular Disorders in Children and Adolescents, Children’s Hospital University Clinic Essen, University of Duisburg, Essen, Germany
                [39] GRID grid.13339.3b, ISNI 0000000113287408, Department of Neurology, Medical University of Warsaw, Warsaw, Poland
                [40] GRID grid.416004.7, ISNI 0000 0001 2167 4686, The Muscle Team, RJAH Orthopaedic Hospital NHS Foundation Trust, Oswestry, UK
                [41] GRID grid.410552.7, ISNI 0000 0004 0628 215X, Department of Pediatric Neurology, University of Turku and Turku University Central Hospital, Turku, Finland
                [42] GRID grid.411705.6, ISNI 0000 0001 0166 0922, Department of Neurology, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran
                [43] GRID grid.11480.3c, ISNI 0000000121671098, Department of Neurosciences, University of the Basque Country UPV-EHU, San Sebastián, Spain
                [44] GRID grid.5361.1, ISNI 0000 0000 8853 2677, Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
                [45] GRID grid.277151.7, ISNI 0000 0004 0472 0371, Centre de Référence Maladies Neuromusculaires AOC, Hôtel-Dieu, University Hospital, Nantes, France
                [46] GRID grid.415172.4, ISNI 0000 0004 0399 4960, Department of Paediatric Neurology, Bristol Children’s Hospital, Bristol, UK
                [47] GRID grid.84393.35, ISNI 0000 0001 0360 9602, Neuromuscular Reference Centre, Neurology Department, Hospital Universitari i Politècnic La Fe, Valencia, Spain
                [48] GRID grid.84393.35, ISNI 0000 0001 0360 9602, Neuromuscular Disorders Unit, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
                [49] GRID grid.414476.4, ISNI 0000 0001 0403 1371, Departamento de Neurología, Hospital Galdakao, Galdakao, Spain
                [50] GRID grid.7605.4, ISNI 0000 0001 2336 6580, Neuromuscular Unit, Department of Neurosciences, University of Turin, Turin, Italy
                [51] GRID grid.7942.8, ISNI 0000 0001 2294 713X, Cliniques Universitaires St-Luc, Centre de Référence Neuromusculaire, Université de Louvain, Brussels, Belgium
                [52] Faculty of Medicine, University of Belgrade, Neurology Clinic, Clinical Center of Serbia, Belgrade, Serbia
                [53] Centre de Référence Neuromusculaire Erasme-HUDERF, Department of Neurology, CUB Hôpital Erasme, Université Libre de Bruxelles, Brussels, Belgium
                [54] GRID grid.5361.1, ISNI 0000 0000 8853 2677, Institute of Human Genetics, Medical University Innsbruck, Innsbruck, Austria
                [55] GRID grid.10392.39, ISNI 0000 0001 2190 1447, Department of Child Neurology, Children’s Hospital, University of Tübingen, Tübingen, Germany
                [56] GRID grid.5808.5, ISNI 0000 0001 1503 7226, Centro Hospitalar Universitário Porto, Porto, Portugal
                [57] GRID grid.429180.6, Institute of Neurology, Psychiatry and Narcology of NAMS of Ukraine, Kharkiv, Ukraine
                [58] GRID grid.4808.4, ISNI 0000 0001 0657 4636, Department of Laboratory Diagnostics, Clinical Medical Centre Zagreb, University of Zagreb, Zagreb Medical School, Zagreb, Republic of Croatia
                [59] GRID grid.412468.d, ISNI 0000 0004 0646 2097, Klinik für Neuropädiatrie, University Medical Center Schleswig-Holstein, Kiel, Germany
                [60] Folkhaelsan Department of Medical Genetics, Helsinki, Finland
                [61] GRID grid.15485.3d, ISNI 0000 0000 9950 5666, Department of Clinical Genetics, HUSLAB, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
                [62] GRID grid.7942.8, ISNI 0000 0001 2294 713X, Cliniques Universitaires St-Luc, Centre de Génétique Humaine, Université de Louvain, Brussels, Belgium
                [63] GRID grid.5507.7, ISNI 0000 0001 0740 5199, Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria
                [64] GRID grid.5252.0, ISNI 0000 0004 1936 973X, Dr. v. Hauner Children’s Hospital, Department of Pediatric Neurology and Developmental Medicine, LMU—University of Munich, Munich, Germany
                [65] GRID grid.5254.6, ISNI 0000 0001 0674 042X, Copenhagen Neuromuscular Center Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
                [66] GRID grid.7737.4, ISNI 0000 0004 0410 2071, Department of Medical Genetics, University of Helsinki, Helsinki, Finland
                [67] GRID grid.1006.7, ISNI 0000 0001 0462 7212, Present Address: Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
                [68] Present Address: Northern Genetics Service, Newcastle upon Tyne NHS Foundation Trust, Newcastle upon Tyne, UK
                [69] GRID grid.47100.32, ISNI 0000000419368710, Present Address: Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
                [70] GRID grid.483732.9, Present Address: Directorate-General for Research and Innovation, Directorate E, Unit E2 Combatting Diseases, Brussels, Belgium
                [71] GRID grid.415306.5, ISNI 0000 0000 9983 6924, Present Address: Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, Australia
                [72] GRID grid.1058.c, ISNI 0000 0000 9442 535X, Present Address: Centre for Population Genomics, Murdoch Children’s Research Institute, Melbourne, Australia
                [73] GRID grid.419334.8, ISNI 0000 0004 0641 3236, Present Address: Department of Neurology, Royal Victoria Infirmary, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
                Author information
                http://orcid.org/0000-0002-9227-2526
                Article
                840
                10.1038/s41436-020-0840-3
                7462745
                32528171
                a1ac9d95-0df5-4da1-8911-3ef6a29429a2
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, and provide a link to the Creative Commons license. You do not have permission under this license to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

                History
                : 25 November 2019
                : 5 May 2020
                : 8 May 2020
                Categories
                Article
                Custom metadata
                © American College of Medical Genetics and Genomics 2020

                Genetics
                next-generation sequencing,targeted exome analysis,neuromuscular disease,limb-girdle weakness,genetic diagnosis
