GWAS with Heterogeneous Data: Estimating the Fraction of Phenotypic Variation Mediated by Gene Expression Data.

Eriko Sasaki, Florian Frommlet, Magnus Nordborg
Published in: G3 (Bethesda, Md.) (2018)
Intermediate phenotypes such as gene expression values can be used to elucidate the mechanisms by which genetic variation causes phenotypic variation, but jointly analyzing such heterogeneous data is far from trivial. Here we extend a so-called mediation model to handle the confounding effects of genetic background, and use it to analyze flowering time variation in Arabidopsis thaliana, focusing in particular on the central role played by the key regulator FLOWERING TIME LOCUS C (FLC). FLC polymorphism and FLC expression are both strongly correlated with flowering time variation, but the effect of the former is only partly mediated through the latter. Furthermore, the latter also reflects genetic background effects. We demonstrate that it is possible to partition these effects, shedding light on the complex regulatory network that underlies flowering time variation.
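For readers unfamiliar with mediation models, the basic idea of splitting a genotype effect into a part that goes through expression and a part that does not can be sketched as follows. This is a minimal illustration of the classical product-of-coefficients approach on simulated data, not the extended model of the paper: it deliberately ignores the confounding effect of genetic background that the authors address, and every name in it (`ols`, `mediation_effects`, the simulated effect sizes) is a hypothetical chosen for this example.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def mediation_effects(genotype, expression, phenotype):
    """Classical single-mediator decomposition (no genetic background term)."""
    # Total effect c: phenotype regressed on genotype alone.
    c_total = ols(phenotype, genotype)[1]
    # Path a: effect of genotype on the mediator (expression).
    a = ols(expression, genotype)[1]
    # Direct effect c' and path b: phenotype regressed on genotype and expression.
    _, c_direct, b = ols(phenotype, genotype, expression)
    indirect = a * b
    return {
        "total": c_total,
        "direct": c_direct,
        "indirect": indirect,
        "fraction_mediated": indirect / c_total,
    }


# Simulated data (assumed effect sizes, for illustration only):
# a biallelic marker affects expression, and both affect the phenotype.
rng = np.random.default_rng(0)
n = 2000
genotype = rng.integers(0, 2, size=n).astype(float)
expression = 1.0 * genotype + rng.normal(size=n)
phenotype = 0.5 * genotype + 0.5 * expression + rng.normal(size=n)

effects = mediation_effects(genotype, expression, phenotype)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

In ordinary least squares the total effect equals the direct effect plus the indirect effect, so `fraction_mediated` is the share of the marker's effect that passes through expression in this simplified setting. With related individuals, as in a natural population sample, such estimates can be biased by genetic background, which is the problem the extended model is meant to handle.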
Keyphrases
  • arabidopsis thaliana
  • gene expression
  • electronic health record
  • dna methylation
  • big data
  • genome wide
  • machine learning
  • depressive symptoms
  • social support
  • binding protein
  • data analysis
  • deep learning
  • network analysis