      Assessing effectiveness of many-objective evolutionary algorithms for selection of tag SNPs

      research-article
      PLOS ONE
      Public Library of Science

          Abstract

          Background

          Genome-wide association studies help to determine the causes of many genetic diseases. These studies typically focus on associations between single-nucleotide polymorphisms (SNPs). Genotyping every SNP in a chromosomal region to identify genetic variation is computationally very expensive. A representative subset of SNPs, called tag SNPs, can be used to identify genetic variation instead. A small set of tag SNPs reduces the computation time of the genotyping platform; however, a small set may suffer from missing data or genotyping errors. This study aims to solve the tag SNP selection problem using many-objective evolutionary algorithms.

          Methods

          Tag SNP selection can be viewed as an optimization problem with trade-offs between objectives, e.g. minimizing the number of tag SNPs and maximizing tolerance for missing data. In this study, the tag SNP selection problem is formulated as a many-objective problem. Two many-objective evolutionary algorithms, the Nondominated Sorting Genetic Algorithm III (NSGA-III) and the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), are applied and investigated for optimal tag SNP selection. This study also investigates different initialization methods, namely greedy and random initialization.
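
The trade-off described above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the toy haplotype matrix, the tolerance definition (a tag set tolerates t missing values if every pair of haplotypes differs at t + 1 or more tag SNPs), and all function names are hypothetical, since the abstract does not give the paper's exact objectives or operators. Only two objectives are shown; the study formulates more.

```python
import random
from itertools import combinations

# Toy haplotype matrix (hypothetical data): rows are haplotypes, columns are SNPs.
HAPLOTYPES = [
    (0, 0, 1, 0, 1),
    (0, 1, 1, 0, 0),
    (1, 0, 0, 1, 1),
    (1, 1, 0, 0, 0),
]

def objectives(mask, haplotypes=HAPLOTYPES):
    """Return (number of tag SNPs, negated tolerance); both are minimized."""
    selected = [j for j, bit in enumerate(mask) if bit]
    # For each haplotype pair, count the selected SNPs at which the pair differs.
    min_diff = min(
        sum(a[j] != b[j] for j in selected)
        for a, b in combinations(haplotypes, 2)
    )
    # Assumed definition: t missing values are tolerated if min_diff >= t + 1.
    tolerance = min_diff - 1
    return (len(selected), -tolerance)

def dominates(f, g):
    """Pareto dominance for minimization: f is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(f, g)) and any(x < y for x, y in zip(f, g))

def random_init(n_snps, rng):
    """Random initialization: each SNP is selected with probability 0.5."""
    return [rng.randint(0, 1) for _ in range(n_snps)]

def greedy_init(haplotypes=HAPLOTYPES):
    """Greedy initialization: repeatedly add the SNP that separates the most
    still-indistinguishable haplotype pairs (a set-cover style heuristic)."""
    n_snps = len(haplotypes[0])
    pairs = set(combinations(range(len(haplotypes)), 2))
    mask = [0] * n_snps
    while pairs:
        def gain(j):
            return sum(haplotypes[a][j] != haplotypes[b][j] for a, b in pairs)
        best = max(range(n_snps), key=gain)
        if gain(best) == 0:
            break  # remaining pairs cannot be separated by any SNP
        mask[best] = 1
        pairs = {(a, b) for a, b in pairs
                 if haplotypes[a][best] == haplotypes[b][best]}
    return mask

if __name__ == "__main__":
    small = greedy_init()
    full = [1] * len(HAPLOTYPES[0])
    print(small, objectives(small))
    print(full, objectives(full))
    print(random_init(5, random.Random(0)))
```

On this toy data the greedy seed uses few SNPs but tolerates no missing values, while selecting every SNP tolerates more at a higher genotyping cost. Neither solution dominates the other, which is the kind of trade-off a Pareto-based algorithm is meant to expose.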

          Results

          The evaluation measures used to compare the algorithms are Hypervolume, Range, SumMin, MinSum, Tolerance rate, and Average Hamming distance. Overall, MOEA/D gives superior results compared to the other algorithms in most cases. NSGA-III outperforms NSGA-II and the other compared algorithms on maximum tolerance rate, and SPEA2 outperforms all algorithms on average Hamming distance.
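
Two of the measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes the average Hamming distance is taken over all pairs of binary selection masks in a solution set, and it computes hypervolume only for two minimized objectives against a chosen reference point. The function names and sample values are hypothetical.

```python
from itertools import combinations

def average_hamming_distance(masks):
    """Mean number of differing positions over all pairs of binary masks."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        return 0.0
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)

def hypervolume_2d(front, ref):
    """Area dominated by `front` (two minimized objectives), bounded by `ref`."""
    # Keep only points that are strictly better than the reference point.
    points = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in points:
        if f2 < best_f2:
            # Add the horizontal slab newly covered by this point.
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

if __name__ == "__main__":
    masks = [[1, 1, 0, 0, 0], [1, 1, 1, 1, 1]]
    print(average_hamming_distance(masks))
    print(hypervolume_2d([(2, 0), (5, -1)], ref=(6, 2)))
```

A larger hypervolume indicates a front that is closer to the ideal point and better spread, while a larger average Hamming distance indicates more diverse tag SNP selections.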

          Conclusion

          Experimental results show that the proposed many-objective algorithms perform substantially better than existing methods. The outcomes also show the advantages of greedy initialization over random initialization when using NSGA-III, SPEA2, and MOEA/D to solve tag SNP selection as a many-objective optimization problem.

                Author and article information

                Contributors
                Roles: Methodology; Software; Writing – original draft
                Roles: Conceptualization; Formal analysis; Methodology; Project administration; Supervision; Writing – review & editing
                Roles: Formal analysis; Investigation; Supervision; Validation; Writing – original draft; Writing – review & editing
                Role: Editor
                Journal
                PLoS One
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 8 December 2022
                Volume: 17
                Issue: 12
                Article: e0278560
                Affiliations
                [001] FAST School of Computing, National University of Computer and Emerging Sciences, Lahore, Pakistan
                Torrens University Australia, AUSTRALIA
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0001-6124-5317
                Article
                Manuscript: PONE-D-21-31031
                DOI: 10.1371/journal.pone.0278560
                PMCID: PMC9731481
                PMID: 36480538
                0ef0ce03-0f11-4928-80a9-185d7facad43
                © 2022 Moqa et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 11 October 2021
                Accepted: 19 November 2022
                Page count
                Figures: 10, Tables: 5, Pages: 24
                Funding
                The authors received no specific funding for this work.
                Categories
                Research Article
                Biology and Life Sciences / Genetics / Single Nucleotide Polymorphisms
                Biology and Life Sciences / Genetics / Heredity / Genetic Mapping / Haplotypes
                Physical Sciences / Mathematics / Applied Mathematics / Algorithms / Evolutionary Algorithms
                Research and Analysis Methods / Simulation and Modeling / Algorithms / Evolutionary Algorithms
                Research and Analysis Methods / Computational Techniques / Evolutionary Computation / Evolutionary Algorithms
                Physical Sciences / Mathematics / Applied Mathematics / Algorithms
                Research and Analysis Methods / Simulation and Modeling / Algorithms
                Physical Sciences / Mathematics / Optimization
                Biology and Life Sciences / Genetics / Genomics / Human Genomics
                Biology and Life Sciences / Molecular Biology / Molecular Biology Techniques / Genotyping
                Research and Analysis Methods / Molecular Biology Techniques / Genotyping
                Physical Sciences / Mathematics / Applied Mathematics / Algorithms / Genetic Algorithms
                Research and Analysis Methods / Simulation and Modeling / Algorithms / Genetic Algorithms
                Custom metadata
                The data underlying the results presented in the study are available from https://www.science.org/doi/abs/10.1126/science.1105436.
