Classification algorithm for high-dimensional protein markers in time-course data.
Gajendra K Vishwakarma, Atanu Bhattacharjee, Souvik Banerjee, Benoit Liquet
Published in: Statistics in Medicine (2020)
Identification of biomarkers is an emerging area in oncology. In this article, we develop an efficient statistical procedure for classifying protein markers according to their effect on cancer progression. A high-dimensional time-course dataset of protein markers for 80 patients motivates the development of the model. The threshold value is formulated as the level of a marker that has maximum impact on cancer progression. A classification algorithm for high-dimensional time-course data is developed and validated by comparing random components using both proportional hazards and accelerated failure time frailty models. The study demonstrates the application of two separate joint modeling techniques, using an autoregressive-type model and a mixed-effects model for the time-course data and a proportional hazards model for the survival data, with Bayesian methodology. In addition, a prognostic score is developed on the basis of a few selected genes and applied to patients. This study facilitates the identification of relevant biomarkers from a set of markers.
Keyphrases
- machine learning
- deep learning
- end stage renal disease
- big data
- newly diagnosed
- electronic health record
- ejection fraction
- chronic kidney disease
- papillary thyroid
- peritoneal dialysis
- prognostic factors
- artificial intelligence
- neural network
- protein protein
- small molecule
- dna methylation
- young adults
- transcription factor
- squamous cell
- childhood cancer