
Structured sparse logistic regression with application to lung cancer prediction using breath volatile biomarkers.

Xiaochen Zhang, Qingzhao Zhang, Xiaofeng Wang, Shuangge Ma, Kuangnan Fang
Published in: Statistics in Medicine (2019)
This article is motivated by a study of lung cancer prediction using breath volatile organic compound (VOC) biomarkers, where the challenge is that the predictors include not only high-dimensional time-dependent or functional VOC features but also time-independent clinical variables. We consider high-dimensional logistic regression and propose two penalties, a group spline penalty and a group smooth penalty, to handle the group structure of the time-dependent variables in the model. Compared with existing methods such as the group lasso and group bridge approaches, the new methods have an advantage when the model coefficients are sparse but change smoothly within each group. Our methods are easy to implement because, after certain transformations, they can be turned into a group minimax concave penalty problem. We show that our fitting algorithm possesses the descent property and leads to attractive convergence properties. Simulation studies and the lung cancer application demonstrate the accuracy and stability of the proposed approaches.
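For readers unfamiliar with the group minimax concave penalty (MCP) mentioned in the abstract, the following is a minimal sketch of its standard textbook form, applied to the Euclidean norm of each coefficient group. It is not the authors' implementation: the spline and smooth transformations described in the abstract are not reproduced, and the names `mcp`, `group_mcp`, `lam`, and `gamma` are illustrative assumptions.

```python
import math


def mcp(t, lam, gamma):
    """Standard MCP evaluated at t >= 0 (assumes lam >= 0 and gamma > 1)."""
    if t <= gamma * lam:
        # Concave region: lasso-like penalty that relaxes as t grows.
        return lam * t - t * t / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large groups are not shrunk further.
    return 0.5 * gamma * lam * lam


def group_mcp(beta, groups, lam, gamma):
    """Sum of MCP applied to the Euclidean norm of each coefficient group.

    `groups` is a list of index lists, one per group (a hypothetical layout
    chosen for this sketch).
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        total += mcp(norm, lam, gamma)
    return total
```

With this form, a group whose coefficients are all zero contributes nothing to the penalty, which is what allows entire groups to be dropped from the model, while the penalty on a group with a large norm is capped at `gamma * lam**2 / 2`.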
Keyphrases
  • machine learning
  • high resolution
  • mass spectrometry
  • simultaneous determination