Predicting Deep Learning Based Multi-Omics Parallel Integration Survival Subtypes in Lung Cancer Using Reverse Phase Protein Array Data.
Satoshi Takahashi, Ken Asada, Ken Takasawa, Ryo Shimoyama, Akira Sakai, Amina Bolatkan, Norio Shinkai, Kazuma Kobayashi, Masaaki Komatsu, Syuzo Kaneko, Jun Sese, Ryuji Hamamoto
Published in: Biomolecules (2020)
Mortality attributed to lung cancer accounts for a large fraction of cancer deaths worldwide. With increasing mortality figures, the accurate prediction of prognosis has become essential. In recent years, multi-omics analysis has emerged as a useful survival prediction tool. However, the methodology for multi-omics analysis has not yet been fully established, and further improvements are required for clinical applications. In this study, we developed a novel method to accurately predict the survival of patients with lung cancer using multi-omics data. Using unsupervised learning techniques, we first detected survival-associated subtypes in non-small cell lung cancer from multi-omics datasets spanning six data categories in The Cancer Genome Atlas (TCGA). The new subtypes, referred to as integration survival subtypes, clearly divided patients into longer- and shorter-surviving groups (log-rank test: p = 0.003), and we confirmed that this division is independent of histopathological classification (Chi-square test of independence: p = 0.94). Next, we attempted to detect the integration survival subtypes using a dataset from only one of these categories. A machine learning model trained only on reverse phase protein array (RPPA) data could accurately predict the integration survival subtypes (AUC = 0.99). The predicted subtypes could also distinguish between high- and low-risk patients (log-rank test: p = 0.012). Overall, this study explores the potential of multi-omics analysis to accurately predict the prognosis of patients with lung cancer.
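The abstract evaluates the subtypes with a log-rank test, which compares the survival of two patient groups. The following is a minimal sketch of a two-group log-rank test in plain Python, included only to illustrate what that statistic measures. It is not the authors' code: the function name `logrank_test` and the survival times are hypothetical, and the toy data are synthetic rather than taken from TCGA.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  : follow-up time per patient
    events_* : 1 if death was observed, 0 if the patient was censored
    """
    # Pool both groups, tagging each sample with its group index (0 or 1).
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in samples if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for s, _, g in samples if s >= t and g == 0)
        n_b = sum(1 for s, _, g in samples if s >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for s, e, g in samples if s == t and e and g == 0)
        d_b = sum(1 for s, e, g in samples if s == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n  # expected events in group A under H0
        if n > 1:
            # Hypergeometric variance of the group A event count.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Synthetic toy data (hypothetical): a shorter- and a longer-surviving group.
short_times, short_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
long_times, long_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
stat, p_value = logrank_test(short_times, short_events, long_times, long_events)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that the two groups' survival curves differ more than would be expected by chance. In practice an established survival-analysis library would normally be used instead of a hand-written routine.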
Keyphrases
- machine learning
- single cell
- deep learning
- end stage renal disease
- free survival
- chronic kidney disease
- ejection fraction
- newly diagnosed
- squamous cell carcinoma
- big data
- papillary thyroid
- risk factors
- high throughput
- artificial intelligence
- type diabetes
- gene expression
- mass spectrometry
- patient reported outcomes
- dna methylation
- data analysis
- resistance training
- protein protein
- amino acid