A deep convolutional neural network approach for predicting phenotypes from genotypes.

Wenlong Ma, Zhixu Qiu, Jie Song, Jiajia Li, Qian Cheng, Jingjing Zhai, Chuang Ma
Published in: Planta (2018)
Deep learning is a promising technology for accurately selecting individuals with high phenotypic values based on genotypic data. Genomic selection (GS) is a promising breeding strategy in which the phenotypes of plant individuals are predicted from genome-wide genotypic markers. In this study, we present a deep learning method, named DeepGS, to predict phenotypes from genotypes. Using a deep convolutional neural network, DeepGS learns hidden variables that jointly represent features in genotypes when making predictions; it also employs convolution, sampling and dropout strategies to reduce the complexity of high-dimensional genotypic data. We used a large GS dataset to train DeepGS and compared its performance with other methods. The experimental results indicate that DeepGS can be used as a complement to the commonly used RR-BLUP in the prediction of phenotypes from genotypes. The complementarity between DeepGS and RR-BLUP can be exploited through an ensemble learning approach to select individuals with high phenotypic values more accurately, even in the absence of outlier individuals and subsets of genotypic markers. The source code of DeepGS and of the ensemble learning approach has been packaged into Docker images to facilitate their application in different GS programs.
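To make the ensemble idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a closed-form ridge regression as a stand-in for RR-BLUP, treats the second set of predictions as coming from any other model (DeepGS itself is not reproduced here), and picks the mixing weight by a simple grid search on validation data. All function names, the penalty `lam`, and the grid search are illustrative assumptions.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centered data.

    Assumption: used here as an RR-BLUP-style marker-effect model;
    `lam` is a hypothetical shrinkage penalty chosen by the caller.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - y_mean) for the marker effects.
    lhs = Xc.T @ Xc + lam * np.eye(X.shape[1])
    rhs = Xc.T @ (y - y_mean)
    beta = np.linalg.solve(lhs, rhs)
    return beta, x_mean, y_mean


def ridge_predict(model, X):
    """Predict phenotypes for a genotype matrix X from a fitted model."""
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean


def ensemble(pred_a, pred_b, weight_a):
    """Weighted average of two prediction vectors."""
    return weight_a * pred_a + (1.0 - weight_a) * pred_b


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    if np.std(a) == 0 or np.std(b) == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])


def best_weight(pred_a, pred_b, y_val, grid=None):
    """Pick the weight for pred_a that maximizes validation correlation.

    Assumption: an 11-point grid search, chosen only for illustration.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 11)
    scores = [pearson(ensemble(pred_a, pred_b, w), y_val) for w in grid]
    return float(grid[int(np.argmax(scores))])
```

In use, `pred_a` would hold the ridge predictions on a validation set and `pred_b` the predictions of a second model; the weight returned by `best_weight` would then be applied to both models' predictions on new individuals.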
Keyphrases
  • convolutional neural network
  • deep learning
  • artificial intelligence
  • genome wide
  • big data
  • machine learning
  • electronic health record
  • public health
  • dna methylation
  • copy number
  • gene expression
  • high speed
  • data analysis