
      The landscape of genomic structural variation in Indigenous Australians

      research-article


          Abstract

          Indigenous Australians harbour rich and unique genomic diversity. However, Aboriginal and Torres Strait Islander ancestries are historically under-represented in genomics research and almost completely missing from reference datasets [1–3]. Addressing this representation gap is critical, both to advance our understanding of global human genomic diversity and as a prerequisite for ensuring equitable outcomes in genomic medicine. Here we apply population-scale whole-genome long-read sequencing [4] to profile genomic structural variation across four remote Indigenous communities. We uncover an abundance of large insertion–deletion variants (20–49 bp; n = 136,797), structural variants (50 bp–50 kb; n = 159,912) and regions of variable copy number (>50 kb; n = 156). The majority of variants are composed of tandem repeat or interspersed mobile element sequences (up to 90%) and have not been previously annotated (up to 62%). A large fraction of structural variants appear to be exclusive to Indigenous Australians (12% lower-bound estimate) and most of these are found in only a single community, underscoring the need for broad and deep sampling to achieve a comprehensive catalogue of genomic structural variation across the Australian continent. Finally, we explore short tandem repeats throughout the genome to characterize allelic diversity at 50 known disease loci [5], uncover hundreds of novel repeat expansion sites within protein-coding genes, and identify unique patterns of diversity and constraint among short tandem repeat sequences. Our study sheds new light on the dimensions and dynamics of genomic structural variation within and beyond Australia.
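
          The three size classes named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the class labels, the treatment of exactly 50 kb as a structural variant, and the convention 1 kb = 1,000 bp are all assumptions made for illustration.

```python
# Hypothetical helper: bins a variant by length using the size ranges
# quoted in the abstract (20-49 bp, 50 bp-50 kb, >50 kb).
# Assumption: 1 kb = 1,000 bp, and exactly 50 kb counts as a structural variant.

KB = 1_000


def classify_variant_size(length_bp: int) -> str:
    """Return the abstract's size class for a variant of the given length."""
    if length_bp < 0:
        raise ValueError("variant length must be non-negative")
    if length_bp < 20:
        # Shorter variants fall outside the classes listed in the abstract.
        return "below reported size range"
    if length_bp <= 49:
        return "large indel"
    if length_bp <= 50 * KB:
        return "structural variant"
    return "copy number variable region"


# Counts reported in the abstract, keyed by the same labels.
REPORTED_COUNTS = {
    "large indel": 136_797,
    "structural variant": 159_912,
    "copy number variable region": 156,
}
```

          For example, `classify_variant_size(300)` falls in the structural variant class, whereas a 75 kb event would be binned as a region of variable copy number.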

          Abstract

          Population-scale whole-genome sequencing across four remote Indigenous Australian communities reveals a large fraction of structural variants that are unique to these populations, emphasizing the genetic distinctiveness of and diversity among Indigenous Australians.


          Most cited references (51)


          A global reference for human genetic variation

          The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

            Minimap2: pairwise alignment for nucleotide sequences

            Heng Li (2018)
            Recent advances in sequencing technologies promise ultra-long reads of ∼100 kb in average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 Mb in length. Existing alignment programs are unable or inefficient to process such data at scale, which presses for the development of new alignment algorithms.

              The mutational constraint spectrum quantified from variation in 141,456 humans

              Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

                Author and article information

                Contributors
                hardip.patel@anu.edu.au
                i.deveson@garvan.org.au
                Journal
                Nature
                Publisher: Nature Publishing Group UK (London)
                ISSN: 0028-0836 (print), 1476-4687 (electronic)
                Published: 13 December 2023
                Volume: 624
                Issue: 7992
                Pages: 602-610
                Affiliations
                [1] Genomics and Inherited Disease Program, Garvan Institute of Medical Research (https://ror.org/01b3dvp57), Sydney, New South Wales, Australia
                [2] Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children’s Research Institute (GRID grid.415306.5; ISNI 0000 0000 9983 6924), Darlinghurst, New South Wales, Australia
                [3] Faculty of Medicine, University of New South Wales (https://ror.org/03r8z3t63), Sydney, New South Wales, Australia
                [4] School of Computer Science and Engineering, University of New South Wales (https://ror.org/03r8z3t63), Sydney, New South Wales, Australia
                [5] National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University (GRID grid.1001.0; ISNI 0000 0001 2180 7477), Canberra, Australian Capital Territory, Australia
                [6] Institute for Applied Ecology, University of Canberra (GRID grid.1039.b; ISNI 0000 0004 0385 7472), Canberra, Australian Capital Territory, Australia
                [7] Department of Ophthalmology, Flinders University (https://ror.org/01kpzv902), Bedford Park, South Australia, Australia
                [8] Menzies Institute for Medical Research, University of Tasmania (https://ror.org/01nfmeh72), Hobart, Tasmania, Australia
                [9] Australian Centre for Ancient DNA, School of Biological Sciences and Environment Institute, University of Adelaide (https://ror.org/00892tw58), Adelaide, South Australia, Australia
                [10] ARC Centre of Excellence for Australian Biodiversity and Heritage, University of Adelaide (https://ror.org/00892tw58), Adelaide, South Australia, Australia
                [11] Indigenous Genomics, Telethon Kids Institute (https://ror.org/01dbmzx78), Adelaide, South Australia, Australia
                [12] Telethon Kids Institute and Division of Paediatrics, Faculty of Health and Medical Sciences, University of Western Australia (https://ror.org/047272k79), Perth, Western Australia, Australia
                [13] Genetic Services of Western Australia, Western Australian Department of Health (https://ror.org/01epcny94), Perth, Western Australia, Australia
                [14] Western Australian Register of Developmental Anomalies, Western Australian Department of Health (https://ror.org/01epcny94), Perth, Western Australia, Australia
                [15] Immunology Division, The Walter and Eliza Hall Institute of Medical Research (https://ror.org/01b6kha49), Parkville, Victoria, Australia
                [16] Molly Wardaguga Research Centre, Faculty of Health, Charles Darwin University (https://ror.org/048zcaj52), Brisbane City, Queensland, Australia
                [17] Mater Research Institute and School of Nursing and Midwifery, University of Queensland (https://ror.org/00rqy9422), South Brisbane, Queensland, Australia
                [18] The Lowitja Institute (https://ror.org/01nfhtc03), Melbourne, Victoria, Australia
                [19] Office of the Registrar of Indigenous Corporations (https://ror.org/03fy7b149), Canberra, Australian Capital Territory, Australia
                [20] School of Media and Communication, RMIT University (https://ror.org/04ttjf776), Melbourne, Victoria, Australia
                [21] Centre of Excellence for Automated Decision-Making and Society, RMIT University (GRID grid.1017.7; ISNI 0000 0001 2163 3550), Melbourne, Victoria, Australia
                [22] Commonwealth Scientific and Industrial Research Organisation (GRID grid.1016.6; ISNI 0000 0001 2173 2719), Canberra, Australian Capital Territory, Australia
                [23] The Australian National University (GRID grid.1001.0; ISNI 0000 0001 2180 7477), Canberra, Australian Capital Territory, Australia
                Author information
                http://orcid.org/0000-0002-7300-1157
                http://orcid.org/0000-0002-4900-2838
                http://orcid.org/0000-0002-4045-4571
                http://orcid.org/0000-0002-9034-9905
                http://orcid.org/0000-0002-7713-1979
                http://orcid.org/0000-0003-2840-4851
                http://orcid.org/0000-0002-5550-9176
                http://orcid.org/0000-0003-4920-9553
                http://orcid.org/0000-0003-1301-405X
                http://orcid.org/0000-0002-0462-502X
                http://orcid.org/0000-0001-6564-2715
                http://orcid.org/0000-0003-3861-0472
                Article
                6842
                DOI: 10.1038/s41586-023-06842-7
                PMCID: PMC10733147
                PMID: 38093003
                fc01cf53-cb7d-46a8-be4e-fad80e36492b
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 16 January 2023
                Accepted: 7 November 2023
                Categories
                Article
                Custom metadata
                © Springer Nature Limited 2023

                Keywords
                medical genomics, structural variation, next-generation sequencing, genetic variation
