Detecting differentially expressed genes from RNA-seq data using fuzzy clustering.

Yuki Ando, Asanao Shimokawa
Published in: The International Journal of Biostatistics (2024)
A two-group comparison test is generally performed on RNA sequencing data to detect differentially expressed genes (DEGs). However, the accuracy of this approach is low because of the small sample size. To address this, we propose a fuzzy clustering method that artificially generates data with expression patterns similar to those of DEGs, uses these data as the initial cluster data, and identifies genes that are highly likely to be classified into the same cluster. An advantage of the proposed method is that it does not perform any test. Furthermore, a certain level of accuracy can be maintained even when the sample size is biased, and we show that such a situation may even improve the accuracy of the proposed method. We compared the proposed method with the conventional method using simulations, in which we varied the sample size and the difference between the expression levels of group 1 and group 2 in the DEGs to obtain the desired accuracy of the proposed method. The results show that the proposed method is superior under all simulated conditions. We also show that the effect of the difference between group 1 and group 2 on the accuracy is more prominent when the sample size is biased.
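The general idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain fuzzy c-means written in NumPy, and the simulation settings (`n1`, `n2`, `delta`, the number of artificial profiles, normal noise on a log-like scale, only up-regulated DEGs, and the 0.5 membership cutoff) are all assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_c_means(X, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy c-means. Returns (memberships, centers).

    The memberships are those computed in the final iteration,
    just before the last center update.
    """
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # u_ik is proportional to d_ik^(-2/(m-1)), normalized over clusters.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Centers are membership-weighted means of the rows.
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers

# --- Hypothetical simulation settings (assumptions, not from the paper) ---
rng = np.random.default_rng(0)
n1, n2 = 3, 3                      # samples in group 1 and group 2
n_genes, n_deg, delta = 1000, 100, 4.0

# Log-scale expression with unit-variance noise; the first n_deg genes
# are shifted upward by delta in group 2 only.
X = rng.normal(0.0, 1.0, size=(n_genes, n1 + n2))
X[:n_deg, n1:] += delta
is_deg = np.arange(n_genes) < n_deg

# Artificially generated profiles with a DEG-like expression pattern.
n_art = 200
artificial = rng.normal(0.0, 1.0, size=(n_art, n1 + n2))
artificial[:, n1:] += delta

# Cluster real and artificial profiles together, centered per gene so that
# only the between-group pattern matters, not the overall expression level.
data = np.vstack([X, artificial])
data = data - data.mean(axis=1, keepdims=True)

# Initialize one center at "no difference" and one at the artificial pattern.
init = np.vstack([np.zeros(n1 + n2), data[n_genes:].mean(axis=0)])
U, centers = fuzzy_c_means(data, init)

# The DEG cluster is the one the artificial profiles mostly belong to.
deg_cluster = int(U[n_genes:].mean(axis=0).argmax())
score = U[:n_genes, deg_cluster]   # membership of each real gene
called = score > 0.5               # illustrative cutoff, not from the paper

print("genes called as DEG:", int(called.sum()))
print("fraction of true DEGs recovered:", float(called[is_deg].mean()))
```

Changing `n1` and `n2` to unequal values, or lowering `delta`, gives a rough feel for how the membership scores respond to the two factors varied in the simulations, although the numbers from this toy setup say nothing about the accuracy reported in the paper.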
Keyphrases
  • rna seq
  • single cell
  • electronic health record
  • poor prognosis
  • genome wide
  • gene expression
  • transcription factor
  • machine learning
  • deep learning
  • data analysis
  • bioinformatics analysis