Machine learning based disease prediction from genotype data.

Nikoletta Katsaouni, Araek Tashkandi, Lena Wiese, Marcel H Schulz
Published in: Biological chemistry (2021)
Using results from genome-wide association studies to understand complex traits is a current challenge. Here we review how genotype data can be used with different machine learning (ML) methods to predict phenotype occurrence and severity. We discuss common feature encoding schemes and how studies handle the often small number of samples compared to the huge number of variants. We compare which ML methods are being applied, including recent results using deep neural networks. Further, we review the application of methods for feature explanation and interpretation.
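As an illustration of what a feature encoding scheme for genotype data can look like, the sketch below shows additive encoding, in which each variant becomes the count (0, 1, or 2) of alternate alleles a sample carries. The abstract does not say which schemes the review covers, so this choice is an assumption; the function name `additive_encode`, the sample identifiers, and the genotype calls are all hypothetical.

```python
def additive_encode(genotype, alt_allele):
    """Count copies of alt_allele (0, 1, or 2) in a diploid call such as 'A/G'.

    Returns None for missing or malformed calls (for example './.').
    """
    # Treat phased ('|') and unphased ('/') separators the same way.
    alleles = genotype.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return sum(1 for allele in alleles if allele == alt_allele)


# Hypothetical toy data: three samples genotyped at two variants.
samples = {
    "s1": ["A/A", "C/T"],
    "s2": ["A/G", "T/T"],
    "s3": ["G/G", "./."],
}
alt_alleles = ["G", "T"]  # assumed alternate allele for each variant

# One row of numeric features per sample, one column per variant.
matrix = {
    name: [additive_encode(call, alt) for call, alt in zip(calls, alt_alleles)]
    for name, calls in samples.items()
}

for name, row in matrix.items():
    print(name, row)
```

Missing calls are returned as `None` rather than imputed, so any imputation or filtering step stays an explicit, separate decision.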
Keyphrases
  • machine learning
  • big data
  • neural network
  • electronic health record
  • genome wide association
  • risk assessment
  • genome wide
  • gene expression
  • copy number
  • dna methylation