
The developmental and evolutionary characteristics of transcription factor binding site clustered regions based on an explainable machine learning model.

Zhangyi Ouyang, Feng Liu, Wanying Li, Junting Wang, Bijia Chen, Yang Zheng, Yaru Li, Huan Tao, Xiang Xu, Cheng Li, Yuwen Cong, Hao Li, Xiao-Chen Bo, Hebing Chen
Published in: Nucleic acids research (2024)
Gene expression is temporally and spatially regulated by the interaction of transcription factors (TFs) and cis-regulatory elements (CREs). The uneven distribution of TF binding sites across the genome poses challenges in understanding how this distribution evolves to regulate spatio-temporal gene expression and consequent heritable phenotypic variation. In this study, chromatin accessibility profiles and gene expression profiles were collected from several species including mammals (human, mouse, bovine), fish (zebrafish and medaka), and chicken. Transcription factor binding site clustered regions (TFCRs) at different embryonic stages were characterized to investigate regulatory evolution. The study revealed dynamic changes in TFCR distribution during embryonic development and species evolution. The synchronization between TFCR complexity and gene expression was assessed across species using RegulatoryScore. Additionally, an explainable machine learning model highlighted the importance of the distance between TFCR and promoter in the coordinated regulation of gene expression by TFCRs. Our results revealed the developmental and evolutionary dynamics of TFCRs during embryonic development from fish and chicken to mammals. These data provide valuable resources for exploring the relationship between transcriptional regulation and phenotypic differences during embryonic development.
Keyphrases
  • gene expression
  • transcription factor
  • dna methylation
  • machine learning
  • genome wide
  • dna binding
  • genome wide identification
  • big data
  • endothelial cells
  • single cell
  • dna damage
  • electronic health record