
      Accounting for Population Stratification in Practice: A Comparison of the Main Strategies Dedicated to Genome-Wide Association Studies

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          Genome-Wide Association Studies are powerful tools for detecting genetic variants associated with diseases. Their results have, however, been questioned, in part because of the bias induced by population stratification: systematic differences in allele frequencies arising from differences in sample ancestry, which can lead to either false positive or false negative findings. Many strategies are available to account for stratification, but their performance differs according to, for instance, the type of population structure, the minor allele frequency of the disease susceptibility locus, the degree of sampling imbalance, or the sample size. We focus on the type of population structure and compare the most commonly used methods for dealing with stratification: Genomic Control, Principal Component based methods such as the one implemented in Eigenstrat, adjusted Regressions, and Meta-Analysis strategies. Our assessment is based on a large simulation study involving several scenarios that correspond to many types of population structure. We considered both false positive rate and power to determine which methods perform best. Our analysis showed that, when there is no population structure, none of the tests introduced bias or decreased power, except for the Meta-Analyses. When the population is stratified, adjusted Logistic Regressions and Eigenstrat are the best solutions to account for stratification, although only the Logistic Regressions consistently maintain correct false positive rates. This study provides more details about these methods. Their advantages and limitations in different stratification scenarios are highlighted in order to propose practical guidelines for accounting for population stratification in Genome-Wide Association Studies.
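
Genomic Control, the first strategy named in the abstract, is commonly described as rescaling the association test statistics by an inflation factor estimated from their median. The sketch below illustrates that general idea only; it is not the authors' simulation code, and the function name `genomic_control`, the flooring of the factor at 1, and the use of 1-df chi-square statistics are assumptions made for the illustration.

```python
import math
import statistics
from statistics import NormalDist

# Median of a chi-square distribution with 1 degree of freedom,
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_control(chi2_stats):
    """Rescale 1-df chi-square statistics by an estimated inflation factor.

    Hypothetical helper for illustration: returns the inflation factor,
    the corrected statistics, and their p-values.
    """
    # Inflation factor: observed median over the expected median under the null.
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: never inflate statistics when lam falls below 1.
    lam = max(lam, 1.0)
    corrected = [x / lam for x in chi2_stats]
    # Upper-tail p-value of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_values = [math.erfc(math.sqrt(x / 2.0)) for x in corrected]
    return lam, corrected, p_values
```

An inflation factor close to 1 suggests little inflation of the test statistics, whereas a clearly larger value is the usual signal that stratification (or another source of inflation) is present.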


          Most cited references (31)


          Association mapping in structured populations.

          The use, in association studies, of the forthcoming dense genomewide collection of single-nucleotide polymorphisms (SNPs) has been heralded as a potential breakthrough in the study of the genetic basis of common complex disorders. A serious problem with association mapping is that population structure can lead to spurious associations between a candidate marker and a phenotype. One common solution has been to abandon case-control studies in favor of family-based tests of association, such as the transmission/disequilibrium test (TDT), but this comes at a considerable cost in the need to collect DNA from close relatives of affected individuals. In this article we describe a novel, statistically valid, method for case-control association studies in structured populations. Our method uses a set of unlinked genetic markers to infer details of population structure, and to estimate the ancestry of sampled individuals, before using this information to test for associations within subpopulations. It provides power comparable with the TDT in many settings and may substantially outperform it if there are conflicting associations in different subpopulations.

            Population stratification and spurious allelic association.

            Great efforts and expense have been expended in attempts to detect genetic polymorphisms contributing to susceptibility to complex human disease. Concomitantly, technology for detection and scoring of single nucleotide polymorphisms (SNPs) has undergone rapid development, extensive catalogues of SNPs across the genome have been constructed, and SNPs have been increasingly used as a means for investigation of the genetic causes of complex human diseases. For many diseases, population-based studies of unrelated individuals--in which case-control and cohort studies serve as standard designs for genetic association analysis--can be the most practical and powerful approach. However, extensive debate has arisen about optimum study design, and considerable concern has been expressed that these approaches are prone to population stratification, which can lead to biased or spurious results. Over the past decade, a great shift has been noted, away from case-control and cohort studies, towards family-based association designs. These designs have fewer problems with population stratification but have greater genotyping and sampling requirements, and data can be difficult or impossible to gather. We discuss past evidence for population stratification on genotype-phenotype association studies, review methods to detect and account for it, and present suggestions for future study design and analysis.

              Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach.

              The most commonly used method in evolutionary biology for combining information across multiple tests of the same null hypothesis is Fisher's combined probability test. This note shows that an alternative method called the weighted Z-test has more power and more precision than does Fisher's test. Furthermore, in contrast to some statements in the literature, the weighted Z-method is superior to the unweighted Z-transform approach. The results in this note show that, when combining P-values from multiple tests of the same hypothesis, the weighted Z-method should be preferred.
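
The abstract above does not spell out the formula, so the following is a minimal sketch of the weighted Z-method in its standard (Stouffer-type) form: each one-sided p-value is converted to a Z-score, and the scores are combined as the weighted sum divided by the square root of the sum of squared weights. The function name `weighted_z` and the choice of weights are assumptions for illustration; in practice the weights are chosen by the analyst.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def weighted_z(p_values, weights):
    """Combine one-sided p-values with the weighted Z-method.

    Hypothetical helper: returns the combined Z-score and its
    one-sided p-value. Each p-value must lie strictly between 0 and 1.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and of equal length")
    # Convert each p-value to the Z-score with that upper-tail probability.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Weighted sum of Z-scores, normalised to unit variance under the null.
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    z_combined = numerator / denominator
    return z_combined, 1.0 - _STD_NORMAL.cdf(z_combined)
```

For example, `weighted_z([0.04, 0.20], [2.0, 1.0])` gives the first test twice the weight of the second; setting all weights equal reduces the calculation to the unweighted Z-transform approach mentioned in the abstract.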

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 21 December 2011
                Volume 6, Issue 12, e28845
                Affiliations
                [1] Department of Biostatistics, Pharnext, Paris, France
                [2] Statistics and Genome Laboratory, University of Evry Val d'Essonne, UMR CNRS 8071 - USC INRA, Evry, France
                Editor's affiliation: Aarhus University, Denmark
                Author notes

                Conceived and designed the experiments: MB MG CA. Performed the experiments: MB. Analyzed the data: MB MG. Contributed reagents/materials/analysis tools: MB. Wrote the paper: MB MG CA. Significantly contributed to the paper: MB MG CA.

                Article
                Manuscript ID: PONE-D-11-14474
                DOI: 10.1371/journal.pone.0028845
                PMC ID: 3244428
                PubMed ID: 22216125
                2c1d1721-bcf3-4978-84fc-2350363ab626
                Bouaziz et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 21 July 2011
                Accepted: 16 November 2011
                Page count
                Pages: 13
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Genome Analysis Tools
                Genome-Wide Association Studies
                Population Genetics
                Evolutionary Biology
                Population Genetics
                Genetics
                Human Genetics
                Genome-Wide Association Studies
                Genetics of Disease
                Genome-Wide Association Studies
                Population Genetics
                Genomics
                Genome Analysis Tools
                Genome-Wide Association Studies
                Population Biology
                Population Genetics
                Mathematics
                Statistics
                Biostatistics
