Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing

Abstract

Background

Genomic selection (GS) in forestry can substantially reduce the length of the breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies have made it possible to genotype large numbers of trees at a reasonable cost.

Results

Genotyping-by-sequencing was used to genotype 1,126 interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared: mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor (kNN-Fam). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and the Generalized Ridge Regression (GRR) to test different assumptions about trait architecture. Finally, using PCA, multi-trait GS prediction models were developed. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR, indicating that the genetic architecture for these traits is complex. Multi-site GS prediction accuracies were high and better than those of single sites, while cross-site predictability produced the lowest accuracies, reflecting type-b genetic correlations, and was deemed unreliable. The incorporation of genomic information in quantitative genetics analyses produced more realistic heritability estimates, as the half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principal component scores as representatives of multi-trait GS prediction models produced surprising results, where negatively correlated traits could be concurrently selected for using PCA2 and PCA3.
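To make the imputation and prediction steps concrete, the following is a minimal sketch of mean-value imputation followed by a ridge estimate of marker effects of the RR-BLUP kind. It is not the authors' pipeline: the data are simulated, the function names `mean_impute` and `rr_blup_effects` are hypothetical, the ridge parameter is fixed by hand rather than estimated from variance components, and accuracy is taken here as the correlation between predicted values and held-out phenotypes.

```python
import numpy as np

def mean_impute(genotypes):
    """Replace missing calls (NaN) in each marker column with that column's mean."""
    filled = np.array(genotypes, dtype=float)
    col_means = np.nanmean(filled, axis=0)
    missing = np.isnan(filled)
    # Look up the column mean for every missing cell.
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    return filled

def rr_blup_effects(markers, phenotypes, ridge):
    """Ridge estimate of marker effects with one shrinkage parameter shared by all markers."""
    centered = markers - markers.mean(axis=0)
    y = phenotypes - phenotypes.mean()
    lhs = centered.T @ centered + ridge * np.eye(centered.shape[1])
    return np.linalg.solve(lhs, centered.T @ y)

# Simulated toy data (assumption): 60 trees, 200 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n_trees, n_snps = 60, 200
geno = rng.integers(0, 3, size=(n_trees, n_snps)).astype(float)
true_effects = rng.normal(0.0, 0.1, n_snps)
pheno = geno @ true_effects + rng.normal(0.0, 1.0, n_trees)

# Mask about 30% of the calls, then impute them with marker means.
geno_missing = geno.copy()
geno_missing[rng.random(geno.shape) < 0.3] = np.nan
imputed = mean_impute(geno_missing)

# Train on the first 45 trees, predict the remaining 15.
train, test = np.arange(0, 45), np.arange(45, 60)
effects = rr_blup_effects(imputed[train], pheno[train], ridge=50.0)
gebv = (imputed[test] - imputed[train].mean(axis=0)) @ effects
accuracy = np.corrcoef(gebv, pheno[test])[0, 1]
print(f"predictive correlation on held-out trees: {accuracy:.2f}")
```

Using a single shrinkage parameter for every marker corresponds to the assumption that many loci each contribute a small effect, which is the architecture the RR-BLUP results above are said to favor.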

Conclusions

The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation, proved to be effective. Prediction accuracies obtained for all traits strongly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that the ability of single-site GS models to predict other sites is unreliable, supporting the use of a multi-site approach. Principal component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1597-y) contains supplementary material, which is available to authorized users.

Most cited references 50

Mapping genes for complex traits in domestic animals and their use in breeding programmes.

Michael E Goddard, Ben J. Hayes (2009)

Genome-wide panels of SNPs have recently been used in domestic animal species to map and identify genes for many traits and to select genetically desirable livestock. This has led to the discovery of the causal genes and mutations for several single-gene traits but not for complex traits. However, the genetic merit of animals can still be estimated by genomic selection, which uses genome-wide SNP panels as markers and statistical methods that capture the effects of large numbers of SNPs simultaneously. This approach is expected to double the rate of genetic improvement per year in many livestock systems.

Missing value estimation methods for DNA microarrays.

Annette Hastie, Allison Altman, John P. Brown … (2001)

Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K-means clustering are not robust to missing data, and may lose effectiveness even with a few missing values. Methods for imputing missing data are needed, therefore, to minimize the effect of incomplete data sets on analyses, and to increase the range of data sets to which these algorithms can be applied. In this report, we investigate automated methods for estimating missing data. We present a comparative study of several methods for the estimation of missing values in gene microarray data. We implemented and evaluated three methods: a Singular Value Decomposition (SVD) based method (SVDimpute), weighted K-nearest neighbors (KNNimpute), and row average. We evaluated the methods using a variety of parameter settings and over different real data sets, and assessed the robustness of the imputation methods to the amount of missing data over the range of 1--20% missing values. We show that KNNimpute appears to provide a more robust and sensitive method for missing value estimation than SVDimpute, and both SVDimpute and KNNimpute surpass the commonly used row average method (as well as filling missing values with zeros). We report results of the comparative experiments and provide recommendations and tools for accurate estimation of missing microarray data under a variety of conditions.

Increased accuracy of artificial selection by using the realized relationship matrix.

B. Hayes, P M Visscher, M. E. Goddard (2009)

Dense marker genotypes allow the construction of the realized relationship matrix between individuals, with elements the realized proportion of the genome that is identical by descent (IBD) between pairs of individuals. In this paper, we demonstrate that by replacing the average relationship matrix derived from pedigree with the realized relationship matrix in best linear unbiased prediction (BLUP) of breeding values, the accuracy of the breeding values can be substantially increased, especially for individuals with no phenotype of their own. We further demonstrate that this method of predicting breeding values is exactly equivalent to the genomic selection methodology where the effects of quantitative trait loci (QTLs) contributing to variation in the trait are assumed to be normally distributed. The accuracy of breeding values predicted using the realized relationship matrix in the BLUP equations can be deterministically predicted for known family relationships, for example half sibs. The deterministic method uses the effective number of independently segregating loci controlling the phenotype that depends on the type of family relationship and the length of the genome. The accuracy of predicted breeding values depends on this number of effective loci, the family relationship and the number of phenotypic records. The deterministic prediction demonstrates that the accuracy of breeding values can approach unity if enough relatives are genotyped and phenotyped. For example, when 1000 full sibs per family were genotyped and phenotyped, and the heritability of the trait was 0.5, the reliability of predicted genomic breeding values (GEBVs) for individuals in the same full sib family without phenotypes was 0.82. These results were verified by simulation. A deterministic prediction was also derived for random mating populations, where the effective population size is the key parameter determining the effective number of independently segregating loci. 
If the effective population size is large, a very large number of individuals must be genotyped and phenotyped in order to accurately predict breeding values for unphenotyped individuals from the same population. If the heritability of the trait is 0.3, and N(e)=100, approximately 12474 individuals with genotypes and phenotypes are required in order to predict GEBVs of un-phenotyped individuals in the same population with an accuracy of 0.7 [corrected].
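The equivalence stated in this abstract, between BLUP with a marker-derived relationship matrix and a model with normally distributed marker effects, can be checked numerically. The sketch below is an illustration under assumptions, not the cited paper's code: the data are simulated, the shrinkage parameter `lam` is fixed by hand, and the relationship matrix is left unscaled as `K = Z Z'` (any constant scaling of `K` can be absorbed into `lam`).

```python
import numpy as np

# Simulated centered genotypes and phenotypes (assumption: toy data only).
rng = np.random.default_rng(1)
n_ind, n_markers = 30, 120
Z = rng.integers(0, 3, size=(n_ind, n_markers)).astype(float)
Z -= Z.mean(axis=0)
y = rng.normal(size=n_ind)
y -= y.mean()
lam = 25.0  # shrinkage parameter, fixed here rather than estimated

# Marker-effect form: estimate every marker effect, then sum them per individual.
effects = np.linalg.solve(Z.T @ Z + lam * np.eye(n_markers), Z.T @ y)
gebv_markers = Z @ effects

# Relationship-matrix form: work with the individuals-by-individuals matrix instead.
K = Z @ Z.T
gebv_kernel = K @ np.linalg.solve(K + lam * np.eye(n_ind), y)

# The two sets of predicted values agree up to floating-point error.
max_gap = np.max(np.abs(gebv_markers - gebv_kernel))
print(f"largest difference between the two forms: {max_gap:.2e}")
```

The relationship-matrix form solves a system whose size is the number of individuals rather than the number of markers, which is usually the smaller of the two with dense marker panels.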

Author and article information

Contributors

Omnia Gamal El-Dien: omnia.gamal@alumni.ubc.ca

Blaise Ratcliffe: b.ratcliffe@gmail.com

Jaroslav Klápště: klapste.j@gmail.com

Charles Chen: charles.chen@okstate.edu

Ilga Porth: porth@mail.ubc.ca

Yousry A El-Kassaby: y.el-kassaby@ubc.ca

Journal

Journal ID (nlm-ta): BMC Genomics

Journal ID (iso-abbrev): BMC Genomics

Title: BMC Genomics

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2164

Publication date (Electronic): 9 May 2015

Publication date PMC-release: 9 May 2015

Publication date Collection: 2015

Volume: 16

Issue: 1

Electronic Location Identifier: 370

Affiliations

Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, 2424 Main Mall, Vancouver, British Columbia V6T 1Z4 Canada

Department of Genetics and Physiology of Forest Trees, Faculty of Forestry and Wood Sciences, Czech University of Life Sciences Prague, Kamycka 129, 165 21 Prague 6, Czech Republic

Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078-3035 USA

Article

Publisher ID: 1597

DOI: 10.1186/s12864-015-1597-y

PMC ID: 4424896

PubMed ID: 25956247

SO-VID: a6a8b136-5315-420f-acc5-d37090a365fa

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 21 January 2015

Date accepted: 28 April 2015

Custom metadata

ScienceOpen disciplines: Genetics

Keywords: interior spruce, genomic selection, genotyping-by-sequencing, open-pollinated families, genotype x environment interaction, imputation methods, multi-trait GS