Identifying luminal and basal mammary cell specific genes and their expression patterns during pregnancy.
Zhan Dong Li, Xiangtian Yu, Zi Mei, Tao Zeng, Lei Chen, Xian Ling Xu, Hao Li, Tao Huang, Yu-Dong Cai
Published in: PLoS ONE (2022)
The mammary gland is present in all mammals and produces milk to feed the young. Mammogenesis refers to the growth and development of the mammary gland, which begins at puberty and ends after lactation. Pregnancy is regulated by various cytokines, which further contribute to mammary gland development. Epithelial cells, including basal and luminal cells, are one of the major components of the mammary gland. The development of basal and luminal cells has been observed to differ significantly across stages. However, the mechanisms underlying the differences between basal and luminal cells have not been fully studied. To explore the mechanisms underlying the differentiation of mammary progenitors or their offspring into luminal and myoepithelial cells, this work investigated single-cell sequencing data from mammary epithelial cells of virgin and pregnant mice. We evaluated features using Monte Carlo feature selection and plotted incremental feature selection curves with a support vector machine or RIPPER to find the optimal gene features and rules that divide epithelial cells into four clusters defined by cell subtype (basal or luminal) and phase (pregnant or virgin). As representative examples, the feature genes Cldn7, Gjb6, Sparc, Cldn3, Cited1, Krt17, Spp1, Cldn4, Gjb2 and Cldn19 might play an important role in classifying mammary epithelial cells. Notably, the seven most important rules, based on combinations of cell-specific and tissue-specific expression of feature genes, effectively classify mammary epithelial cells in a quantitative and interpretable manner.
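The incremental feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a feature ranking is already available (in the paper this would come from Monte Carlo feature selection), it swaps in a simple nearest-centroid classifier for the support vector machine so that the example needs only the Python standard library, and it runs on a small hypothetical toy data set rather than real expression values. All function and variable names are illustrative.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x.
    Stand-in for the SVM named in the abstract (assumption)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: dist2(centroids[lab], x))

def loo_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy using only the columns in feature_idx."""
    sub = [[row[j] for j in feature_idx] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]:
            hits += 1
    return hits / len(sub)

def incremental_feature_selection(X, y, ranked_features):
    """Score the top-k ranked features for k = 1..N.
    Returns (best_k, curve), where curve[k-1] is the accuracy with k features
    and best_k is the smallest k that reaches the highest accuracy."""
    curve = [loo_accuracy(X, y, ranked_features[:k])
             for k in range(1, len(ranked_features) + 1)]
    best_k = max(range(len(curve)), key=curve.__getitem__) + 1
    return best_k, curve

# Hypothetical toy data: feature 0 tracks cell subtype, feature 1 tracks phase,
# features 2 and 3 carry no class information.
JITTER = [-0.5, -0.3, -0.1, 0.1, 0.3, 0.5]
CLASSES = {
    "basal_virgin": (0, 0),
    "basal_pregnant": (0, 10),
    "luminal_virgin": (10, 0),
    "luminal_pregnant": (10, 10),
}
X, y = [], []
for label, (subtype_level, phase_level) in CLASSES.items():
    for i, j in enumerate(JITTER):
        X.append([
            subtype_level + j,
            phase_level + JITTER[(i + 1) % 6],
            JITTER[(i + 2) % 6],
            JITTER[(i + 3) % 6],
        ])
        y.append(label)

ranked = [0, 1, 2, 3]  # assumed ranking, most informative feature first
best_k, curve = incremental_feature_selection(X, y, ranked)
print(best_k, curve)
```

In this toy setting a single feature cannot separate all four classes, so the curve only reaches its maximum once both the subtype-related and the phase-related feature are included. This mirrors the idea of choosing the feature count at the peak of the incremental feature selection curve.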
Keyphrases
- induced apoptosis
- cell cycle arrest
- single cell
- machine learning
- endoplasmic reticulum stress
- adipose tissue
- oxidative stress
- rna seq
- genome wide
- skeletal muscle
- poor prognosis
- dna methylation
- cell proliferation
- mass spectrometry
- metabolic syndrome
- big data
- artificial intelligence
- monte carlo
- genome wide identification