abc4pwm: affinity based clustering for position weight matrices in applications of DNA sequence analysis.
Omer Ali, Amna Farooq, Mingyi Yang, Victor X Jin, Magnar Bjørås, Junbai Wang
Published in: BMC Bioinformatics (2022)
This work demonstrates applications of abc4pwm in DNA sequence analysis across several types of high-throughput sequencing data, using ~1770 human transcription factor (TF) position weight matrices (PWMs). It recovered known TF motifs at gene promoters based on gene expression profiles (RNA-seq) and identified true TF binding targets for motifs predicted from ChIP-seq experiments. Abc4pwm is a useful tool for TF motif searching, clustering, quality assessment, and integration in multiple types of sequence data analysis, including RNA-seq, ChIP-seq, and ATAC-seq.
Keyphrases
- rna seq
- single cell
- data analysis
- high throughput
- genome wide
- circulating tumor
- high throughput sequencing
- endothelial cells
- circulating tumor cells
- cell free
- body mass index
- single molecule
- physical activity
- weight gain
- weight loss
- machine learning
- dna methylation
- amino acid
- mass spectrometry
- induced pluripotent stem cells