Characterizing proteomic and transcriptomic features of missense variants in amyotrophic lateral sclerosis genes.
Allison Ann Dilliott, Seulki Kwon, Guy A Rouleau, Sumaiya Iqbal, Sali M K Farhan
Published in: Brain : a journal of neurology (2023)
In recent years, a growing number of genes have been associated with amyotrophic lateral sclerosis (ALS), resulting in an increasing number of novel variants, particularly missense variants, many of which are of unknown clinical significance. Here, we leverage the sequencing efforts of the ALS Knowledge Portal (3,864 individuals with ALS and 7,839 controls) and Project MinE ALS Sequencing Consortium (4,366 individuals with ALS and 1,832 controls) to perform proteomic and transcriptomic characterization of missense variants in 24 ALS-associated genes. The two sequencing datasets were interrogated for missense variants in the 24 genes, and variants were annotated with genomic database minor allele frequencies, ClinVar pathogenicity classifications, protein sequence features including UniProt functional site annotations and PhosphoSitePlus post-translational modification (PTM) site annotations, structural features from AlphaFold predicted monomeric 3D structures, and transcriptomic expression levels from Genotype-Tissue Expression (GTEx). We then applied missense variant enrichment and gene-burden testing following binning of variation based on the selected proteomic and transcriptomic features to identify those most relevant to pathogenicity in ALS-associated genes. Using predicted human protein structures from AlphaFold, we determined that missense variants carried by individuals with ALS were significantly enriched in β-sheets and α-helices, as well as in core, buried, or moderately buried regions. At the same time, we identified that hydrophobic amino acid residues, compositionally biased protein regions and protein-protein interaction regions are predominantly enriched in missense variants carried by individuals with ALS. Assessment of expression level based on transcriptomics also revealed enrichment of variants of high and medium expression across all tissues and within the brain.
We further explored enriched features of interest using burden analyses and found that individual genes were indeed driving certain enrichment signals. A case study is presented for SOD1 as a proof of concept of how enriched features may aid in defining variant pathogenicity. Our results present proteomic and transcriptomic features that are important indicators of missense variant pathogenicity in ALS and are distinct from features associated with neurodevelopmental disorders.
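The feature-based enrichment testing described above can be illustrated with a small sketch. The abstract does not state which statistical test was used, so the two-sided Fisher's exact test below is an assumption chosen for illustration, and the counts are invented toy numbers, not results from the study. The function name `fisher_exact_2x2` and the variable names are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (illustrative only):
      a = case variants inside the feature (e.g. a buried region)
      b = case variants outside the feature
      c = control variants inside the feature
      d = control variants outside the feature
    Returns (odds_ratio, p_value).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Toy counts (made up): 30 of 100 case variants and 10 of 100 control
# variants fall inside the feature of interest.
odds_ratio, p_value = fisher_exact_2x2(30, 70, 10, 90)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4g}")
```

An odds ratio above 1 with a small p-value would suggest that variants carried by cases are over-represented in the feature. In a real analysis, testing many features would also call for multiple-testing correction, which this sketch omits.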
Keyphrases
- amyotrophic lateral sclerosis
- copy number
- single cell
- genome wide
- intellectual disability
- protein protein
- poor prognosis
- amino acid
- rna seq
- genome wide identification
- binding protein
- healthcare
- dna methylation
- gene expression
- endothelial cells
- autism spectrum disorder
- biofilm formation
- bioinformatics analysis
- high resolution
- staphylococcus aureus
- label free
- emergency department
- pseudomonas aeruginosa
- brain injury
- genome wide analysis
- multiple sclerosis
- subarachnoid hemorrhage
- blood brain barrier