Partially linear monotone methods with automatic variable selection and monotonicity direction discovery.

Solveig Engebretsen, Ingrid K Glad
Published in: Statistics in medicine (2020)
In many statistical regression and prediction problems, it is reasonable to assume monotone relationships between certain predictor variables and the outcome. Genomic effects on phenotypes, for instance, are often assumed to be monotone. In some settings, however, a partially linear model is more appropriate, with some of the covariates assumed to have a linear effect. Examples include prediction models that combine high-dimensional gene expression data with low-dimensional clinical data, or that combine continuous and categorical covariates. We study methods for fitting the partially linear monotone model, in which some covariates are assumed to have a linear effect on the response and others a monotone (potentially nonlinear) effect. Most existing methods for fitting such models require the monotonicity direction of each monotone effect to be specified a priori. We present methods for fitting partially linear monotone models that perform both automatic variable selection and monotonicity direction discovery. In simulation experiments in both classical and high-dimensional data settings, the proposed methods perform comparably to, or better than, existing methods in terms of estimation, prediction, and variable selection performance.
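For orientation, the partially linear monotone model described in the abstract can be written out as below. The notation is an assumption made here for illustration and is not taken from the paper: $x_{ij}$ denotes the covariates with linear effects, $z_{ik}$ the covariates with monotone effects, and $p$, $q$, $n$ their counts and the sample size.

```latex
\[
  y_i \;=\; \beta_0 \;+\; \sum_{j=1}^{p} \beta_j\, x_{ij}
        \;+\; \sum_{k=1}^{q} g_k(z_{ik}) \;+\; \varepsilon_i,
  \qquad i = 1, \dots, n,
\]
% Assumed notation (illustrative, not the authors'):
%   beta_j : coefficient of the j-th linear covariate
%   g_k    : monotone, possibly nonlinear, function of the k-th covariate;
%            each g_k is either nondecreasing or nonincreasing
%   eps_i  : error term
```

In this notation, variable selection corresponds to setting some $\beta_j = 0$ and some $g_k$ to a constant, and monotonicity direction discovery corresponds to letting the data determine, for each retained $g_k$, whether it is nondecreasing or nonincreasing rather than fixing this in advance.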
Keyphrases
  • gene expression
  • electronic health record
  • small molecule
  • mental health
  • systematic review
  • big data
  • machine learning
  • deep learning
  • neural network
  • high throughput
  • dna methylation
  • data analysis