FusionGDB 2.0: fusion gene annotation updates aided by deep learning.
Pora Kim, Hua Tan, Jiajia Liu, Haeseung Lee, Hyesoo Jung, Himanshu Kumar, Xiaobo Zhou
Published in: Nucleic Acids Research (2021)
A knowledgebase of systematic functional annotations of fusion genes is critical for understanding the genomic context of breakage and for developing therapeutic strategies. FusionGDB is a unique functional annotation database of human fusion genes and has been widely used in studies with diverse aims. In this study, we report fusion gene annotation updates aided by deep learning (FusionGDB 2.0), available at https://compbio.uth.edu/FusionGDB2/. FusionGDB 2.0 substantially updates its contents, including: up-to-date human fusion genes; a fusion gene breakage tendency score from the FusionAI deep learning model, based on the 20 kb DNA sequence around the breakpoint (BP); an investigation of the overlap between fusion breakpoints and 44 human genomic features across five categories of cellular roles; transcribed chimeric sequences and subsequent open reading frame (ORF) analysis, with coding potential estimated by a deep learning approach using Ribo-seq read features; and a rigorous investigation of the retention of protein features of individual fusion partner genes at the protein level. Among ∼102k fusion genes, about 15k kept their ORF in-frame, twice as many as in the previous version, FusionGDB. FusionGDB 2.0 will serve as a reference knowledgebase of fusion gene annotations. It provides eight categories of annotations and will be helpful for diverse human genomic studies.
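The overlap analysis between fusion breakpoints and genomic features can be illustrated with a minimal sketch. It is not the FusionGDB 2.0 pipeline. It assumes features are given as half-open intervals `[start, end)` on a chromosome, and every name and coordinate in it (`build_index`, `overlapping_features`, the toy features) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_index(features):
    """Group (chrom, start, end, name) features by chromosome, sorted by start.

    Intervals are assumed to be half-open: [start, end).
    """
    index = defaultdict(list)
    for chrom, start, end, name in features:
        index[chrom].append((start, end, name))
    for chrom in index:
        index[chrom].sort()
    return index


def overlapping_features(index, chrom, pos):
    """Return the names of all features whose interval contains the breakpoint."""
    hits = []
    for start, end, name in index.get(chrom, []):
        if start > pos:
            break  # features are sorted by start, so no later one can contain pos
        if pos < end:
            hits.append(name)
    return hits


# Toy, made-up coordinates (not real annotations).
toy_features = [
    ("chr1", 100, 200, "CpG_island"),
    ("chr1", 150, 300, "enhancer"),
    ("chr2", 50, 60, "repeat"),
]
toy_index = build_index(toy_features)
hits_175 = overlapping_features(toy_index, "chr1", 175)
hits_200 = overlapping_features(toy_index, "chr1", 200)
hits_missing = overlapping_features(toy_index, "chr3", 10)
```

A linear scan per chromosome keeps the sketch short. For genome-scale feature sets, an interval tree or a dedicated interval-arithmetic tool would be the more usual choice.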
Keyphrases
- deep learning
- genome wide
- genome wide identification
- endothelial cells
- copy number
- machine learning
- dna methylation
- rna seq
- convolutional neural network
- artificial intelligence
- genome wide analysis
- emergency department
- induced pluripotent stem cells
- stem cells
- small molecule
- single cell
- risk assessment
- single molecule
- cell therapy
- human immunodeficiency virus
- hiv infected
- circulating tumor
- hiv testing
- adverse drug