An enriched approach to combining high-dimensional genomic and low-dimensional phenotypic data.

Javier Cabrera, Birol Emir, Ge Cheng, Yajie Duan, Demissie Alemayehu, Yauheniya Cherkas
Published in: Journal of biopharmaceutical statistics (2024)
We describe an approach for combining and analyzing high-dimensional genomic and low-dimensional phenotypic data. The approach applies a scheme of weights to the variables rather than to the observations and thereby permits incorporation of the information provided by the low-dimensional data source. It can also be incorporated into commonly used downstream techniques, such as random forest or penalized regression. Finally, simulated lupus studies involving genetic and clinical data illustrate the overall idea and show that the proposed enriched penalized method can select significant genetic variables while keeping several important clinical variables in the final model.
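To make the idea of weighting variables inside a penalized regression concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify how the weights are constructed. It assumes a lasso in which each variable has its own penalty weight, with small weights for clinical variables so that they are shrunk less than genomic ones. The function name `weighted_lasso`, the weight values, and the simulated data are all hypothetical. The per-variable penalty is obtained by the standard trick of dividing each column by its weight, fitting an ordinary lasso, and rescaling the coefficients.

```python
import numpy as np
from sklearn.linear_model import Lasso


def weighted_lasso(X, y, penalty_weights, alpha=0.1):
    """Lasso with a per-variable penalty alpha * w_j * |b_j| (hypothetical helper).

    Dividing column j by w_j and fitting a plain lasso is equivalent to
    penalizing b_j by w_j; the fitted coefficients are then mapped back.
    """
    w = np.asarray(penalty_weights, dtype=float)
    if np.any(w <= 0):
        raise ValueError("penalty weights must be strictly positive")
    model = Lasso(alpha=alpha, max_iter=10000)
    model.fit(X / w, y)                  # rescale columns by their weights
    return model.coef_ / w, model.intercept_


# Simulated data (assumption): 3 clinical and 500 genomic variables.
rng = np.random.default_rng(0)
n, p_clin, p_gen = 100, 3, 500
X = np.hstack([rng.standard_normal((n, p_clin)),
               rng.standard_normal((n, p_gen))])
beta = np.zeros(p_clin + p_gen)
beta[:p_clin] = [0.5, -0.5, 0.3]         # clinical effects
beta[p_clin:p_clin + 5] = 1.0            # a few genomic effects
y = X @ beta + 0.5 * rng.standard_normal(n)

# Assumed weights: clinical variables are penalized far less than genomic ones.
weights = np.concatenate([np.full(p_clin, 0.05), np.ones(p_gen)])
coef, intercept = weighted_lasso(X, y, weights, alpha=0.1)
selected = np.flatnonzero(coef)          # indices of variables kept in the model
```

With all weights equal to 1 this reduces to an ordinary lasso, so the weights are the only place where information from the low-dimensional source enters in this sketch.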
Keyphrases
  • electronic health record
  • big data
  • copy number
  • systemic lupus erythematosus
  • genome wide
  • healthcare
  • data analysis
  • machine learning
  • social media
  • disease activity
  • health information