Identification of Latent Oncogenes with a Network Embedding Method and Random Forest.

Ran Zhao, Bin Hu, Lei Chen, Bo Zhou
Published in: BioMed research international (2020)
Oncogenes are a special class of genes that can promote tumor initiation. A thorough understanding of oncogenes helps explain the causes of cancer. Experimental techniques were long the standard way to detect oncogenes, but their drawbacks, such as high cost and long turnaround time, have become increasingly evident in recent years. Newly proposed computational methods offer an alternative way to study oncogenes and can provide useful clues for further investigation of candidate genes. To address the limitations of some previous computational methods, such as the lack of a learning procedure and the treatment of genes as isolated entities, this study proposes a novel computational method. The method uses features derived from multiple protein networks, thereby viewing proteins at the system level. A classic machine learning algorithm, random forest, was applied to these features to capture the essential characteristics of oncogenes and build a prediction model. All genes except validated oncogenes were then ranked by a score produced by the prediction model. The top-ranked genes differed considerably from the potential oncogenes discovered by previous methods, and they could be confirmed as novel oncogenes. These results indicate that the newly identified genes can be an essential supplement to previous findings.
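The train-then-rank idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a toy random forest of depth-one trees (decision stumps), written in plain Python and trained on made-up data. The three-dimensional feature vectors stand in for network-derived features, the gene names (`cand_1`, etc.) are hypothetical, and treating a handful of non-oncogenes as negative training samples is an assumption rather than something stated in the abstract.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def train_stump(X, y, feature_ids):
    """Return (feature, threshold, left_score, right_score) for the split
    with the lowest weighted Gini impurity over the given features."""
    fallback = sum(y) / len(y)  # used when no valid split exists
    best = (float("inf"), None, None, fallback, fallback)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if score < best[0]:
                best = (score, f, t,
                        sum(left) / len(left), sum(right) / len(right))
    return best[1:]

def train_forest(X, y, n_trees=50, n_sub_features=2, seed=0):
    """Train stumps on bootstrap samples with random feature subsets."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(X)), k=len(X))  # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_sub_features)
        forest.append(
            train_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def oncogene_score(forest, row):
    """Average, over all stumps, of the oncogene fraction in the leaf hit."""
    total = 0.0
    for f, t, left_score, right_score in forest:
        if f is None or row[f] <= t:
            total += left_score
        else:
            total += right_score
    return total / len(forest)

# Hypothetical feature vectors (illustrative numbers only).
validated_oncogenes = [[0.90, 0.80, 0.10], [0.80, 0.90, 0.20],
                       [0.85, 0.70, 0.15], [0.95, 0.75, 0.30]]
assumed_negatives = [[0.10, 0.20, 0.20], [0.20, 0.10, 0.10],
                     [0.15, 0.30, 0.25], [0.05, 0.25, 0.30]]
candidates = {"cand_1": [0.88, 0.82, 0.20],
              "cand_2": [0.02, 0.05, 0.20],
              "cand_3": [0.90, 0.10, 0.20]}

X = validated_oncogenes + assumed_negatives
y = [1] * len(validated_oncogenes) + [0] * len(assumed_negatives)
forest = train_forest(X, y)

# Rank every candidate (non-validated) gene by its model score.
ranked = sorted(((gene, oncogene_score(forest, vec))
                 for gene, vec in candidates.items()),
                key=lambda pair: pair[1], reverse=True)
for gene, score in ranked:
    print(gene, round(score, 3))
```

In practice one would more likely use an established implementation such as scikit-learn's `RandomForestClassifier` and rank genes by its predicted class probability. The hand-rolled version here only keeps the example dependency-free.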
Keyphrases
  • machine learning
  • genome wide
  • bioinformatics analysis
  • genome wide identification
  • deep learning
  • dna methylation
  • young adults
  • artificial intelligence
  • network analysis