HelPredictor models single-cell transcriptome to predict human embryo lineage allocation.
Pengfei Liang, Lei Zheng, Chunshen Long, Wuritu Yang, Lei Yang, Yongchun Zuo
Published in: Briefings in Bioinformatics (2022)
An in-depth understanding of cell fate decisions in human preimplantation embryos has prompted investigations into how lineage allocation changes, a question that is far from trivial and remains time-consuming to address by experimental methods. It is therefore desirable to develop a novel and effective bioinformatics strategy that considers transitions in coordinated embryo lineage allocation and stage-specific patterns. Machine learning models are increasingly applied to interpret complex datasets and to identify candidate development-related factors and lineage-determining molecular events. Here we developed the first machine learning platform, HelPredictor, which integrates three feature selection methods (principal component analysis, the F-score algorithm and the squared coefficient of variation) with four classical machine learning classifiers; each combination of method and classifier produces an independent output through the incremental feature selection method. Applied to single-cell sequencing data of human embryos, HelPredictor not only achieved 94.9% and 90.9% in cross-validation and independent testing, respectively, but also rapidly classified different embryonic lineages and their developmental trajectories using fewer HelPredictor-predicted factors. These candidate lineage-specific genes are discussed in detail and were clustered to explore transitions in embryonic heterogeneity. Our tool can quickly and efficiently reveal potential lineage-specific and stage-specific biomarkers, and it provides insights into how advanced computational tools contribute to developmental research. The source code is available at https://github.com/liameihao/HelPredictor.
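The ranking-plus-incremental-selection idea described in the abstract can be illustrated with a small sketch. This is not the HelPredictor implementation: it assumes the commonly used multi-class F-score definition (between-class scatter of the class means divided by the summed within-class sample variances), substitutes a simple nearest-centroid classifier with leave-one-out cross-validation for the four classifiers (which the abstract does not name), and runs on a made-up toy matrix. All function names, the lineage labels and the data are hypothetical.

```python
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one feature (assumed definition): between-class scatter
    of the class means divided by the summed within-class sample variances."""
    overall = mean(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = sum((mean(g) - overall) ** 2 for g in groups.values())
    within = sum(variance(g) for g in groups.values())  # needs >= 2 cells per class
    return between / within if within > 0 else float("inf")


def cv_squared(values):
    """Squared coefficient of variation: sample variance / mean^2."""
    m = mean(values)
    return variance(values) / m ** 2 if m != 0 else 0.0


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (an assumption): label of the closest class centroid."""
    groups = {}
    for row, y in zip(train_X, train_y):
        groups.setdefault(y, []).append(row)
    best_label, best_dist = None, float("inf")
    for y, rows in groups.items():
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out cross-validation accuracy of the stand-in classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def incremental_feature_selection(X, y):
    """Rank features by F-score, then evaluate the top-1, top-2, ... subsets
    and keep the smallest subset that reaches the best accuracy."""
    columns = list(zip(*X))
    ranked = sorted(range(len(columns)),
                    key=lambda j: f_score(columns[j], y), reverse=True)
    curve = []
    for k in range(1, len(ranked) + 1):
        X_k = [[row[j] for j in ranked[:k]] for row in X]
        curve.append((k, loo_accuracy(X_k, y)))
    best_k, best_acc = max(curve, key=lambda t: t[1])  # first maximum = smallest k
    return ranked[:best_k], best_acc, curve


# Toy expression matrix (made up): rows are cells, columns are genes.
X = [
    [1.0, 5.0, 3.0],
    [1.2, 3.0, 7.0],
    [0.8, 7.0, 5.0],
    [5.0, 4.0, 6.0],
    [5.2, 6.0, 2.0],
    [4.8, 5.0, 4.0],
]
y = ["EPI", "EPI", "EPI", "TE", "TE", "TE"]  # hypothetical lineage labels

selected, best_acc, curve = incremental_feature_selection(X, y)
print("selected gene indices:", selected, "accuracy:", best_acc)
```

In this toy case only the first gene separates the two groups, so the incremental search stops at a single feature. Swapping `f_score` for `cv_squared` (or a PCA-based ranking) in the `sorted` key would give the other ranking schemes the abstract mentions, and any classifier with a fit/predict interface could replace the nearest-centroid stand-in.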