Cue: a deep-learning framework for structural variant discovery and genotyping.
Victoria Popic, Chris Rohlicek, Fabio Cunial, Iman Hajirasouliha, Dmitry Meleshko, Kiran Garimella, Anant Maheshwari
Published in: Nature Methods (2023)
Structural variants (SVs) are a major driver of genetic diversity and disease in the human genome and their discovery is imperative to advances in precision medicine. Existing SV callers rely on hand-engineered features and heuristics to model SVs, which cannot scale to the vast diversity of SVs nor fully harness the information available in sequencing datasets. Here we propose an extensible deep-learning framework, Cue, to call and genotype SVs that can learn complex SV abstractions directly from the data. At a high level, Cue converts alignments to images that encode SV-informative signals and uses a stacked hourglass convolutional neural network to predict the type, genotype and genomic locus of the SVs captured in each image. We show that Cue outperforms the state of the art in the detection of several classes of SVs on synthetic and real short-read data and that it can be easily extended to other sequencing platforms, while achieving competitive performance.
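The alignment-to-image step described above can be illustrated with a minimal sketch. The abstract does not specify how Cue encodes its images, so everything here is an assumption made for illustration: the function name `pairs_to_image`, the single-channel read-pair count grid, and the binning scheme are hypothetical and are not Cue's actual implementation. The sketch bins the mapped positions of each read and its mate over a genomic interval, so that pairs spanning an unusually large distance (as may happen around a deletion) accumulate away from the diagonal.

```python
from typing import List, Tuple


def pairs_to_image(
    pairs: List[Tuple[int, int]],
    region_start: int,
    region_end: int,
    bins: int,
) -> List[List[int]]:
    """Count read pairs in a bins x bins grid over [region_start, region_end).

    Hypothetical illustration only: rows index the read position,
    columns index the mate position. This is not Cue's encoding.
    """
    if region_end <= region_start or bins <= 0:
        raise ValueError("need region_end > region_start and bins > 0")

    span = region_end - region_start
    image = [[0] * bins for _ in range(bins)]

    for read_pos, mate_pos in pairs:
        # Skip pairs with either end outside the interval being imaged.
        if not (region_start <= read_pos < region_end):
            continue
        if not (region_start <= mate_pos < region_end):
            continue
        # Integer arithmetic maps each position to a bin in [0, bins).
        row = (read_pos - region_start) * bins // span
        col = (mate_pos - region_start) * bins // span
        image[row][col] += 1

    return image


# Toy input: one concordant pair near the diagonal, two pairs with a
# large gap between mates, and one pair outside the region.
toy_pairs = [(100, 150), (120, 900), (130, 910), (2000, 2100)]
toy_image = pairs_to_image(toy_pairs, region_start=0, region_end=1000, bins=10)
```

In this toy input, the two widely separated pairs fall into the same off-diagonal cell, which is the kind of localized pattern a convolutional network could learn to associate with an SV type and locus. A real pipeline would read positions from an alignment file and would likely use several signal channels rather than a single count grid.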
Keyphrases
- deep learning
- convolutional neural network
- genetic diversity
- artificial intelligence
- high throughput
- machine learning
- small molecule
- big data
- single cell
- electronic health record
- endothelial cells
- copy number
- genome wide
- gene expression
- rna seq
- single molecule
- quantum dots
- pluripotent stem cells
- loop mediated isothermal amplification