      Is Open Access

      Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals

      research-article


          Abstract

          The Tohoku Medical Megabank Organization reports the whole-genome sequences of 1,070 healthy Japanese individuals and the construction of a Japanese population reference panel (1KJPN). Through this high-coverage sequencing (32.4× on average), we identify 21.2 million single-nucleotide variants (SNVs), including 12 million novel ones, at an estimated false discovery rate of <1.0%. This detailed analysis detected signatures of purifying selection on regulatory elements as well as coding regions. We also catalogue structural variants, including 3.4 million insertions and deletions, and 25,923 genic copy-number variants. The 1KJPN was effective for imputing genotypes of the Japanese population genome wide. These data demonstrate the value of high-coverage sequencing for constructing population-specific variant panels, which here cover 99.0% of SNVs with minor allele frequency ≥0.1%, and for identifying causal rare variants of complex human disease phenotypes in genetic association studies.

          Abstract

          The Tohoku Medical Megabank Organization establishes a biobank with detailed patient health care and genome information. Here the authors analyse whole-genome sequences of 1,070 Japanese individuals, allowing them to catalogue 21 million single-nucleotide variants including 12 million novel ones.

          Most cited references (44)


          Diet and the evolution of human amylase gene copy number variation.

          Starch consumption is a prominent characteristic of agricultural societies and hunter-gatherers in arid environments. In contrast, rainforest and circum-arctic hunter-gatherers and some pastoralists consume much less starch. This behavioral variation raises the possibility that different selective pressures have acted on amylase, the enzyme responsible for starch hydrolysis. We found that copy number of the salivary amylase gene (AMY1) is correlated positively with salivary amylase protein level and that individuals from populations with high-starch diets have, on average, more AMY1 copies than those with traditionally low-starch diets. Comparisons with other loci in a subset of these populations suggest that the extent of AMY1 copy number differentiation is highly unusual. This example of positive selection on a copy number-variable gene is, to our knowledge, one of the first discovered in the human genome. Higher AMY1 copy numbers and protein levels probably improve the digestion of starchy foods and may buffer against the fitness-reducing effects of intestinal disease.

            Large recurrent microdeletions associated with schizophrenia.

            Reduced fecundity, associated with severe mental disorders, places negative selection pressure on risk alleles and may explain, in part, why common variants have not been found that confer risk of disorders such as autism, schizophrenia and mental retardation. Thus, rare variants may account for a larger fraction of the overall genetic risk than previously assumed. In contrast to rare single nucleotide mutations, rare copy number variations (CNVs) can be detected using genome-wide single nucleotide polymorphism arrays. This has led to the identification of CNVs associated with mental retardation and autism. In a genome-wide search for CNVs associating with schizophrenia, we used a population-based sample to identify de novo CNVs by analysing 9,878 transmissions from parents to offspring. The 66 de novo CNVs identified were tested for association in a sample of 1,433 schizophrenia cases and 33,250 controls. Three deletions at 1q21.1, 15q11.2 and 15q13.3 showing nominal association with schizophrenia in the first sample (phase I) were followed up in a second sample of 3,285 cases and 7,951 controls (phase II). All three deletions significantly associate with schizophrenia and related psychoses in the combined sample. The identification of these rare, recurrent risk variants, having occurred independently in multiple founders and being subject to negative selection, is important in itself. CNV analysis may also point the way to the identification of additional and more prevalent risk variants in genes and pathways involved in schizophrenia.

              Toward better understanding of artifacts in variant calling from high-coverage samples.

              Heng Li (2014)
              Whole-genome high-coverage sequencing has been widely used for personal and cancer genomics as well as in various research areas. However, in the absence of an unbiased whole-genome truth set, the global error rate of variant calls and the leading causal artifacts remain unclear, despite the great efforts made in evaluating variant calling methods. We made 10 single nucleotide polymorphism and INDEL call sets with two read mappers and five variant callers, both on a haploid human genome and on a diploid genome at similar coverage. By investigating false heterozygous calls in the haploid genome, we identified erroneous realignment in low-complexity regions and the incompleteness of the reference genome with respect to the sample as the two major sources of errors, which press for continued improvements in these two areas. We estimated that the error rate of raw genotype calls is as high as 1 in 10-15 kb, but the error rate of post-filtered calls is reduced to 1 in 100-200 kb without significant compromise on sensitivity. BWA-MEM alignment and raw variant calls are available at http://bit.ly/1g8XqRt; scripts and miscellaneous data are available at https://github.com/lh3/varcmp. Contact: hengli@broadinstitute.org

                Author and article information

                Journal
                Nat Commun
                Nature Communications
                Nature Pub. Group
                ISSN: 2041-1723
                Published: 21 August 2015
                Volume: 6
                Article number: 8018
                Affiliations
                [1] Tohoku Medical Megabank Organization, Tohoku University, 2-1, Seiryo-machi, Aoba-ku, Sendai 980-8573, Japan
                [2] Graduate School of Medicine, Tohoku University, 2-1, Seiryo-machi, Aoba-ku, Sendai 980-8575, Japan
                [3] Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki Aza-Aoba, Aoba-ku, Sendai 980-8579, Japan
                [4] International Research Institute of Disaster Science, Tohoku University, 468-1, Aramaki Aza-Aoba, Aoba-ku, Sendai 980-0845, Japan
                [5] Department of Cell and Developmental Biology, University of Michigan Medical School, 109 Zina Pitcher Place, Ann Arbor, Michigan 48109-2200, USA
                [6] Institute of Development, Aging and Cancer, Tohoku University, 4-1, Seiryo-machi, Aoba-ku, Sendai 980-8575, Japan
                [7] Graduate School of Dentistry, Tohoku University, 4-1, Seiryo-machi, Aoba-ku, Sendai 980-8575, Japan
                Author notes
                [*] These authors contributed equally to this work

                [†] Present address: Institute for Genomic Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA

                [‡] Present address: National Cancer Center Research Institute, 5-1-1, Tsukiji, Chuo-ku, Tokyo 104-0045, Japan

                [§] Present address: National Center for Child Health and Development, National Medical Center for Children and Mothers Research Institute, 2-10-1, Okura, Setagaya-ku, Tokyo 157-8535, Japan

                Author information
                http://orcid.org/0000-0003-1516-7973
                Article
                ncomms9018
                DOI: 10.1038/ncomms9018
                PMCID: PMC4560751
                PMID: 26292667
                18940f3a-c8ba-475c-ba2b-596b99f0b634
                Copyright © 2015, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.

                This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

                History
                Received: 22 November 2014
                Accepted: 07 July 2015
                Categories
                Article
