
      A comparison of marker-based estimators of inbreeding and inbreeding depression

      research-article


          Abstract

          Background

          The availability of genome-wide marker data allows estimation of inbreeding coefficients (F, the probability of identity-by-descent, IBD) and, in turn, estimation of the rate of inbreeding depression (ΔID). We investigated, by computer simulations, the accuracy of the most popular estimators of inbreeding based on molecular markers when computing F and ΔID in populations under random mating, under equalization of parental contributions, and under artificial selection. We assessed estimators described by Li and Horvitz (F LH1 and F LH2), VanRaden (F VR1 and F VR2), Yang and colleagues (F YA1 and F YA2), marker homozygosity (F HOM), runs of homozygosity (F ROH) and estimates based on pedigree (F PED) in comparison with estimates obtained from IBD measures (F IBD).
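The abstract names the frequency-based estimators without giving their formulas. The sketch below uses the forms in which these estimators are commonly written in the literature, so the exact expressions, the function name `marker_inbreeding`, and the reading of the suffixes (1 as a ratio of sums over SNPs, 2 as a mean of per-SNP ratios, reversed for the Yang estimators) are assumptions rather than details taken from this record. Genotypes are coded as 0, 1 or 2 copies of a reference allele.

```python
from typing import Dict, Sequence


def marker_inbreeding(genotypes: Sequence[int], freqs: Sequence[float]) -> Dict[str, float]:
    """Marker-based inbreeding estimates for one individual (illustrative sketch).

    genotypes: copies of the reference allele at each SNP (0, 1 or 2).
    freqs: reference-allele frequency at each SNP, strictly between 0 and 1
           (base-population or current frequencies, depending on the scenario).
    """
    if len(genotypes) != len(freqs) or not genotypes:
        raise ValueError("genotypes and freqs must be non-empty and equally long")
    if any(p <= 0.0 or p >= 1.0 for p in freqs):
        raise ValueError("monomorphic SNPs must be removed first")

    n = len(genotypes)
    pairs = list(zip(genotypes, freqs))
    het_exp = [2.0 * p * (1.0 - p) for p in freqs]        # expected heterozygosity per SNP
    vr = [(x - 2.0 * p) ** 2 for x, p in pairs]           # VanRaden-type numerator
    ya = [x * x - (1.0 + 2.0 * p) * x + 2.0 * p * p for x, p in pairs]  # Yang-type numerator
    is_het = [1.0 if x == 1 else 0.0 for x in genotypes]  # observed heterozygosity per SNP
    hom_obs = n - sum(is_het)                             # observed homozygous SNPs
    hom_exp = n - sum(het_exp)                            # expected homozygous SNPs

    return {
        "F_VR1": sum(vr) / sum(het_exp) - 1.0,
        "F_VR2": sum(v / h for v, h in zip(vr, het_exp)) / n - 1.0,
        "F_YA1": sum(y / h for y, h in zip(ya, het_exp)) / n,
        "F_YA2": sum(ya) / sum(het_exp),
        "F_LH1": (hom_obs - hom_exp) / (n - hom_exp),
        "F_LH2": sum(1.0 - t / h for t, h in zip(is_het, het_exp)) / n,
        "F_HOM": hom_obs / n,
    }


if __name__ == "__main__":
    # Invented toy genotypes and frequencies, for illustration only.
    for name, value in marker_inbreeding([0, 1, 2, 2], [0.2, 0.5, 0.7, 0.9]).items():
        print(f"{name}: {value:.4f}")
```

Under these forms, the Yang-type estimate is the average of the matching VanRaden-type and Li and Horvitz-type estimates (F_YA1 with F_VR2 and F_LH2, F_YA2 with F_VR1 and F_LH1), which is a convenient sanity check on any implementation.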

          Results

          If the allele frequencies of a base population taken as a reference for the computation of inbreeding are known, all estimators based on marker allele frequencies are highly correlated with F IBD and provide accurate estimates of the mean ΔID. If base population allele frequencies are unknown and current frequencies are used in the estimations, the highest correlation with F IBD is generally obtained by F LH1 and the best estimator of ΔID is F YA2. The estimators F VR2 and F LH2 have the poorest performance in most scenarios. Assuming that base population allele frequencies are equal to 0.5 results in very biased estimates of the average inbreeding coefficient, but these estimates are highly correlated with F IBD and give relatively good estimates of ΔID. Estimates obtained directly from marker homozygosity (F HOM) substantially overestimate ΔID. Estimates based on runs of homozygosity (F ROH) provide accurate estimates of inbreeding and ΔID. Finally, estimates based on pedigree (F PED) show a lower correlation with F IBD than molecular estimators but provide rather accurate estimates of ΔID. An analysis of data from a pig population supports the main findings of the simulations.
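The two criteria used above, the correlation of an estimator with F IBD and the accuracy of the resulting ΔID, can be illustrated with a minimal sketch. It assumes that ΔID is taken as the least-squares slope of a trait on F, which is a common convention but is not spelled out in the abstract; the helper name `slope_and_correlation` and all numbers are invented for illustration.

```python
from math import sqrt
from typing import Sequence, Tuple


def slope_and_correlation(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (least-squares slope of y on x, Pearson correlation of x and y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("x and y must both vary")
    return sxy / sxx, sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented toy values: true IBD inbreeding, a marker-based estimate, and a trait.
    f_ibd = [0.00, 0.05, 0.10, 0.20, 0.30]
    f_est = [0.02, 0.04, 0.13, 0.18, 0.33]
    trait = [10.1, 9.8, 9.4, 9.1, 8.4]

    _, r = slope_and_correlation(f_est, f_ibd)         # agreement of the estimator with F IBD
    delta_true, _ = slope_and_correlation(f_ibd, trait)  # reference slope using F IBD
    delta_est, _ = slope_and_correlation(f_est, trait)   # slope using the estimator
    print(f"correlation with F_IBD: {r:.3f}")
    print(f"slope on F_IBD: {delta_true:.2f}, slope on estimate: {delta_est:.2f}")
```

An estimator can be biased in its mean (as reported for frequencies fixed at 0.5) and still give a good slope, because the slope depends on how F varies among individuals rather than on its average level.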

          Conclusions

          When base population allele frequencies are known, all marker-allele frequency-based estimators of inbreeding coefficients generally show a high correlation with F IBD and provide good estimates of ΔID. When base population allele frequencies are unknown, F LH1 is the marker frequency-based estimator that is most correlated with F IBD, and F YA2 provides the most accurate estimates of ΔID. Estimates from F ROH are also very precise in most scenarios. The estimators F VR2 and F LH2 have the poorest performance.
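F ROH is usually computed as the fraction of the genome covered by runs of homozygosity. The sketch below is a deliberately simplified version of that idea for a single chromosome: it treats a run as an unbroken stretch of homozygous SNPs that meets a minimum SNP count and a minimum physical length. The thresholds, the function names `find_roh` and `f_roh`, and the lack of tolerance for heterozygous or missing calls are all assumptions; dedicated software typically uses more elaborate window-based rules.

```python
from typing import List, Sequence, Tuple


def find_roh(positions: Sequence[int], genotypes: Sequence[int],
             min_snps: int = 3, min_length: int = 1000) -> List[Tuple[int, int]]:
    """Return (start, end) positions of unbroken homozygous runs on one chromosome.

    positions: SNP coordinates in base pairs, sorted in ascending order.
    genotypes: copies of the reference allele per SNP (0, 1 or 2); 1 is heterozygous.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must be equally long")
    runs: List[Tuple[int, int]] = []
    start = None  # position of the first SNP in the current run
    last = None   # position of the most recent homozygous SNP
    count = 0     # number of SNPs in the current run

    def close_run() -> None:
        # Keep the current run only if it passes both thresholds.
        if start is not None and count >= min_snps and last - start >= min_length:
            runs.append((start, last))

    for pos, g in zip(positions, genotypes):
        if g != 1:                 # homozygous SNP extends (or opens) a run
            if start is None:
                start, count = pos, 0
            count += 1
            last = pos
        else:                      # heterozygous SNP ends the current run
            close_run()
            start = None
    close_run()                    # handle a run that reaches the last SNP
    return runs


def f_roh(positions: Sequence[int], genotypes: Sequence[int], genome_length: int,
          min_snps: int = 3, min_length: int = 1000) -> float:
    """Fraction of genome_length covered by the runs found by find_roh."""
    runs = find_roh(positions, genotypes, min_snps, min_length)
    return sum(end - begin for begin, end in runs) / genome_length


if __name__ == "__main__":
    # Invented toy chromosome with one SNP every 1000 bp.
    pos = [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]
    geno = [0, 0, 2, 1, 2, 2, 0, 2]
    print(find_roh(pos, geno, min_snps=3, min_length=2000))
    print(f_roh(pos, geno, genome_length=10000, min_snps=3, min_length=2000))
```

Unlike the frequency-based estimators, this calculation does not use allele frequencies at all, which is one reason it does not depend on knowing the base population.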

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12711-022-00772-0.


                Author and article information

                Contributors
                armando@uvigo.es
                afedez@inia.csic.es
                villanueva.beatriz@inia.csic.es
                miguel.toro@upm.es
                Journal
                Genetics, Selection, Evolution : GSE (Genet Sel Evol)
                BioMed Central (London)
                ISSN: 0999-193X, 1297-9686
                Published: 27 December 2022
                Volume 54, article 82
                Affiliations
                [1] GRID grid.6312.6, ISNI 0000 0001 2097 6738, Centro de Investigación Mariña, Universidade de Vigo, Facultade de Bioloxía, 36310 Vigo, Spain
                [2] Departamento de Mejora Genética Animal, INIA-CSIC, Ctra. de La Coruña, Km 7.5, 28040 Madrid, Spain
                [3] GRID grid.5690.a, ISNI 0000 0001 2151 2978, Departamento de Producción Agraria, ETSI Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, 28040 Madrid, Spain
                Author information
                http://orcid.org/0000-0001-7391-6974
                Article
                772
                DOI: 10.1186/s12711-022-00772-0
                PMCID: PMC9793638
                PMID: 36575379
                a862d496-a82b-436f-8ab3-edd07a81294c
                © The Author(s) 2022

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 13 April 2022
                Accepted: 14 December 2022
                Funding
                Funded by: Ministerio de Ciencia e Innovación (FundRef http://dx.doi.org/10.13039/501100004837); Award ID: PID2020-114426GB
                Funded by: Secretaria Xeral de Investigación e Desenvolvemento, Xunta de Galicia (FundRef http://dx.doi.org/10.13039/501100010568); Award ID: ED431C 2020-05
                Funded by: European Maritime and Fisheries Fund (FundRef http://dx.doi.org/10.13039/100014510); Award ID: PRTR-C17.I1
                Categories
                Research Article
                Genetics