Enhanced performance of gene expression predictive models with protein-mediated spatial chromatin interactions.
Mateusz Chilinski, Jakub Lipiński, Abhishek Agarwal, Yijun Ruan, Dariusz Plewczyński
Published in: bioRxiv: the preprint server for biology (2023)
Multiple attempts have been made to predict gene expression from sequence, epigenetics, and various other factors. To improve those predictions, we investigated adding protein-specific 3D interactions, which play a major role in the condensation of the chromatin structure in the cell nucleus. To this end, we used the architecture of one of the state-of-the-art algorithms, ExPecto (J. Zhou et al., 2018), and examined how the model metrics change when spatially relevant data are added. We used ChIA-PET interactions mediated by cohesin (24 cell lines), CTCF (4 cell lines), and RNAPOL2 (4 cell lines). As the outcome of the study, we developed the Spatial Gene Expression (SpEx) algorithm, which shows statistically significant improvements in most cell lines.
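The general idea of enriching a per-gene feature set with spatially linked regions can be illustrated with a toy sketch. This is not the SpEx or ExPecto implementation: the abstract does not specify how the ChIA-PET interactions enter the model, so the averaging scheme, the function names (`add_spatial_features`, `fit_ridge`), the array layout, and the use of ridge regression as the predictor are all assumptions made purely for illustration.

```python
import numpy as np


def add_spatial_features(gene_features, anchor_features, interactions):
    """Append, for each gene, the mean feature vector of the regions linked to it.

    gene_features:   (n_genes, d) array of local features per gene (hypothetical).
    anchor_features: (n_anchors, d) array of features at interaction anchors (hypothetical).
    interactions:    iterable of (gene_index, anchor_index) pairs, standing in for
                     protein-mediated contacts such as ChIA-PET loops.
    Genes without any interaction get a zero vector for the spatial part.
    """
    gene_features = np.asarray(gene_features, dtype=float)
    anchor_features = np.asarray(anchor_features, dtype=float)
    n_genes, d = gene_features.shape

    spatial_sum = np.zeros((n_genes, d))
    counts = np.zeros(n_genes)
    for gene_idx, anchor_idx in interactions:
        spatial_sum[gene_idx] += anchor_features[anchor_idx]
        counts[gene_idx] += 1

    linked = counts > 0
    spatial_sum[linked] /= counts[linked, None]  # mean over linked anchors
    return np.hstack([gene_features, spatial_sum])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the last returned weight is the intercept.

    For brevity the intercept is penalized together with the other weights.
    """
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    A = X1.T @ X1 + alpha * np.eye(X1.shape[1])
    return np.linalg.solve(A, X1.T @ y)


# Tiny made-up example: 3 genes, 2 anchors, 2 features each.
genes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
anchors = [[2.0, 2.0], [4.0, 0.0]]
loops = [(0, 0), (0, 1), (2, 1)]  # gene 0 touches both anchors, gene 2 touches anchor 1

augmented = add_spatial_features(genes, anchors, loops)
weights = fit_ridge(augmented, np.array([0.5, 0.1, 0.9]))
```

Under these assumptions, a baseline could be obtained by fitting the same model on `genes` alone and comparing its held-out error with that of the model fitted on `augmented`, which mirrors the with-and-without-spatial-data comparison the abstract describes.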
Keyphrases
- gene expression
- DNA methylation
- machine learning
- genome wide
- amino acid
- binding protein
- poor prognosis
- computed tomography
- transcription factor
- single cell
- protein protein
- PET CT
- electronic health record
- small molecule
- stem cells
- bone marrow
- mass spectrometry
- long non-coding RNA
- cell therapy
- positron emission tomography
- bioinformatics analysis