Comparative Analysis and Phylogenetic Insights of Cas14-Homology Proteins in Bacteria and Archaea.
Numan Ullah, Naisu Yang, Zhongxia Guan, Kuilin Xiang, Yali Wang, Mohamed Diaby, Cai Chen, Bo Gao, Chengyi Song
Published in: Genes (2023)
Type V-F Cas12f proteins, also known as Cas14, have drawn significant interest among the diverse CRISPR-Cas nucleases because of their compact size. This study analyzes and compares Cas14-homology proteins in prokaryotic genomes through genome mining, sequence comparison, phylogenetic analysis, and array/repeat analysis. We identified a total of 93 Cas14-homology proteins, ranging in size from 344 aa to 843 aa. The largest share of these proteins was found within the Firmicutes group, which contained 37 species, representing 42% of all the Cas14-homology proteins identified. In archaea, the DPANN group had the highest number of species containing Cas14-homology proteins, a total of three species. The phylogenetic analysis divides the Cas14-homology proteins into three clades: Cas14-A, Cas14-B, and Cas14-U. A domain comparison of the three clades revealed extensive similarity at the C-terminal domain (CTD), suggesting a potentially shared mechanism of action, since the cutting domains are located in that region. In contrast, a sequence similarity analysis of all the identified Cas14 sequences indicated a low level of overall similarity (18%) between the protein variants. The analysis of repeats/arrays in the extended nucleotide sequences of the identified Cas14-homology proteins showed that 44 of the mined proteins possessed CRISPR-associated repeats, 20 of which were specific to Cas14. Our study contributes to the understanding of Cas14 proteins across prokaryotic genomes. These homologous proteins have potential for future applications in the mining and engineering of Cas14 proteins.
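
As an illustration of what a figure such as the 18% similarity could summarize, the following is a minimal sketch of averaging pairwise percent identity over an existing multiple sequence alignment. The abstract does not state how similarity was computed, so the identity definition used here (matches divided by columns where neither sequence has a gap), the function names, and the toy sequences are all assumptions made for illustration, not the authors' method or data.

```python
from itertools import combinations
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumes both strings come from the same alignment (equal length, '-' for gaps).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def mean_pairwise_identity(aligned: Dict[str, str]) -> float:
    """Average percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(percent_identity(aligned[p], aligned[q]) for p, q in pairs)
    return total / len(pairs)


# Hypothetical aligned fragments, NOT taken from the study.
toy_alignment = {
    "seqA": "MKV-LTA",
    "seqB": "MRVQLSA",
    "seqC": "MKI-LTG",
}

if __name__ == "__main__":
    print(f"mean pairwise identity: {mean_pairwise_identity(toy_alignment):.1f}%")
```

Other identity definitions (for example, dividing by the full alignment length or by the shorter sequence length) give different values, so any comparison with a published figure depends on knowing which convention was used.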