CBP60-DB: An AlphaFold-predicted plant kingdom-wide database of the CALMODULIN-BINDING PROTEIN 60 protein family with a novel structural clustering algorithm.
Keaun Amani, Vanessa Shivnauth, Christian Danve M Castroverde
Published in: Plant Direct (2023)
Molecular genetic analyses in the model species Arabidopsis thaliana have demonstrated the major roles of different CALMODULIN-BINDING PROTEIN 60 (CBP60) proteins in growth, stress signaling, and immune responses. Prominently, CBP60g and SARD1 are paralogous CBP60 transcription factors that regulate numerous components of the immune system, such as cell surface and intracellular immune receptors, MAP kinases, WRKY transcription factors, and biosynthetic enzymes for immunity-activating metabolites salicylic acid (SA) and N-hydroxypipecolic acid (NHP). However, their function, regulation, and diversification in most species remain unclear. Here, we have created CBP60-DB (https://cbp60db.wlu.ca/), a structural and bioinformatic database that comprehensively characterized 1052 CBP60 gene homologs (encoding 2376 unique transcripts and 1996 unique proteins) across 62 phylogenetically diverse genomes in the plant kingdom. We have employed deep learning-predicted structural analyses using AlphaFold2 and then generated dedicated web pages for all plant CBP60 proteins. Importantly, we have generated a novel clustering visualization algorithm to interrogate kingdom-wide structural similarities for more efficient inference of conserved functions across various plant taxa. Because well-characterized CBP60 proteins in Arabidopsis are known to be transcription factors with putative calmodulin-binding domains, we have integrated external bioinformatic resources to analyze protein domains and motifs. Collectively, we present a plant kingdom-wide identification of this important protein family in a user-friendly AlphaFold-anchored database, representing a novel and significant resource for the broader plant biology community.
Keyphrases
- transcription factor
- protein protein
- binding protein
- deep learning
- dna binding
- immune response
- cell wall
- machine learning
- arabidopsis thaliana
- genome wide identification
- cell surface
- signaling pathway
- plant growth
- ms ms
- neural network
- single molecule
- stress induced
- amino acid