Accuracies of genomic predictions for disease resistance of striped catfish to Edwardsiella ictaluri using artificial intelligence algorithms

Abstract

Assessments of genomic prediction accuracies using artificial intelligence (AI) algorithms (i.e., machine and deep learning methods) are currently unavailable or very limited in aquaculture species. The principal aim of this study was to examine the predictive performance of these new methods for disease resistance to Edwardsiella ictaluri in a population of striped catfish Pangasianodon hypophthalmus and to compare them with four common methods, i.e., pedigree-based best linear unbiased prediction (PBLUP), genomic-based best linear unbiased prediction (GBLUP), single-step GBLUP (ssGBLUP), and a nonlinear Bayesian approach (notably BayesR). Our analyses using machine learning (i.e., ML-KAML) and deep learning (i.e., DL-MLP and DL-CNN), together with the four common methods (PBLUP, GBLUP, ssGBLUP, and BayesR), were conducted for two main disease resistance traits: survival status (coded as 0 or 1) and survival time (the number of days the animals remained alive after the challenge test). The pedigree consisted of 560 individual animals (490 offspring and 70 parents) genotyped for 14,154 single nucleotide polymorphisms (SNPs). The results using 6,470 SNPs after quality control showed that machine learning methods outperformed PBLUP, GBLUP, and ssGBLUP, increasing the prediction accuracies for both traits by 9.1–15.4%. However, the prediction accuracies obtained from machine learning methods were comparable to those estimated using BayesR. Imputation of missing genotypes using AlphaFamImpute increased the prediction accuracies by 5.3–19.2% in all the methods and data used. On the other hand, there were insignificant decreases (0.3–5.6%) in the prediction accuracies for both survival status and survival time when multivariate models were used instead of univariate analyses. Interestingly, the genomic prediction accuracies based on only highly significant SNPs (P < 0.00001; 318–400 SNPs for survival status and 1,362–1,589 SNPs for survival time) were somewhat lower (by 0.3–15.6%) than those obtained from the whole set of 6,470 SNPs. In most of our analyses, the accuracies of genomic prediction were somewhat higher for survival time than for survival status (0/1 data). We conclude that, although there are prospects for applying genomic selection to increase disease resistance to E. ictaluri in striped catfish breeding programs, these methods should be evaluated further in independent families or populations as more data accumulate in future generations, to avoid possible biases in the genetic parameter estimates and prediction accuracies for the disease resistance traits studied in this population of striped catfish P. hypophthalmus.
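For readers unfamiliar with GBLUP, the following is a minimal illustrative sketch and not the authors' pipeline. It assumes a VanRaden-style genomic relationship matrix, a known heritability (`h2`), a single random training/validation split, and accuracy measured as the correlation between predicted breeding values and observed phenotypes in the validation animals. The abstract does not state how accuracy was computed, so that definition is an assumption, and all data, sizes, and function names here are simulated or hypothetical.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (animals x SNPs)."""
    p = genotypes.mean(axis=0) / 2.0           # observed allele frequency per SNP
    z = genotypes - 2.0 * p                    # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(g, y, train, test, h2):
    """Predict breeding values of `test` animals from `train` phenotypes."""
    lam = (1.0 - h2) / h2                      # residual-to-genetic variance ratio
    mu = y[train].mean()                       # overall mean as the only fixed effect
    g_tt = g[np.ix_(train, train)]
    rhs = np.linalg.solve(g_tt + lam * np.eye(len(train)), y[train] - mu)
    return g[np.ix_(test, train)] @ rhs

# Simulated toy data (assumed sizes, NOT the study's 560 animals / 6,470 SNPs)
rng = np.random.default_rng(0)
n_animals, n_snps, h2 = 200, 500, 0.4
freqs = rng.uniform(0.1, 0.9, n_snps)
genotypes = rng.binomial(2, freqs, size=(n_animals, n_snps)).astype(float)
effects = rng.normal(0.0, 1.0, n_snps)
tbv = genotypes @ effects
tbv = (tbv - tbv.mean()) / tbv.std()           # standardised true breeding values
y = np.sqrt(h2) * tbv + np.sqrt(1.0 - h2) * rng.normal(size=n_animals)

g = vanraden_g(genotypes)
idx = rng.permutation(n_animals)
train, test = idx[:160], idx[160:]             # one random 80/20 split
pred = gblup_predict(g, y, train, test, h2)
accuracy = np.corrcoef(pred, y[test])[0, 1]    # assumed accuracy definition
print(f"validation accuracy: {accuracy:.3f}")
```

In practice the variance components would be estimated rather than fixed, and accuracy would be averaged over repeated cross-validation folds. A binary trait such as survival status may also call for a threshold or logistic model rather than this linear one.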

Author and article information

Contributors

R Houston: Role: Editor

Journal

Journal ID (nlm-ta): G3 (Bethesda)

Journal ID (iso-abbrev): Genetics

Journal ID (publisher-id): g3journal

Title: G3: Genes|Genomes|Genetics

Publisher: Oxford University Press

ISSN (Electronic): 2160-1836

Publication date Collection: January 2022

Publication date (Electronic): 22 October 2021

Publication date PMC-release: 22 October 2021

Volume: 12

Issue: 1

Electronic Location Identifier: jkab361

Affiliations

[1] School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia

[2] Genecology Research Center, University of the Sunshine Coast, Sippy Downs, QLD, Australia

[3] Research Institute for Aquaculture No.2, Ho Chi Minh 710000, Vietnam

[4] Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam

[5] Vietnam National University of Agriculture, Gia Lam 131000, Vietnam

Author notes

Corresponding author: vunt.ria2@mard.gov.vn or ThanhVu.Nguyen@research.usc.edu.au (N.T.); nnguyen@usc.edu.au (N.H.)

Author information

Nguyen Thanh Vu https://orcid.org/0000-0003-0236-0221

Nguyen Hong Nguyen https://orcid.org/0000-0002-4143-955X

Article

Publisher ID: jkab361

DOI: 10.1093/g3journal/jkab361

PMC ID: 8727988

PubMed ID: 34788431

SO-VID: 8e1d4a0f-85a5-403f-b0f5-efc77d951c19

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 05 August 2021

Date accepted: 10 October 2021

Date: 11 November 2021

Page count

Pages: 13

Funding

Funded by: Ministry of Agriculture and Rural Development (MARD);

Funded by: Breeding for disease resistance to Bacillary Necrosis of Pangasius for striped catfish;

Funded by: University of the Sunshine Coast, DOI 10.13039/501100001796;
