Genome-wide repeat landscapes in cancer and cell-free DNA.
Akshaya V Annapragada, Noushin Niknafs, James Robert White, Daniel C Bruhm, Christopher M Cherry, Jamie E Medina, Vilmos Adleff, Carolyn Hruban, Dimitrios Mathios, Zachariah H Foda, Jillian A Phallen, Robert B Scharpf, Victor E Velculescu
Published in: Science Translational Medicine (2024)
Genetic changes in repetitive sequences are a hallmark of cancer and other diseases, but characterizing these has been challenging using standard sequencing approaches. We developed a de novo kmer finding approach, called ARTEMIS (Analysis of RepeaT EleMents in dISease), to identify repeat elements from whole-genome sequencing. Using this method, we analyzed 1.2 billion kmers in 2837 tissue and plasma samples from 1975 patients, including those with lung, breast, colorectal, ovarian, liver, gastric, head and neck, bladder, cervical, thyroid, or prostate cancer. We identified tumor-specific changes in these patients in 1280 repeat element types from the LINE, SINE, LTR, transposable element, and human satellite families. These included changes to known repeats and 820 elements that were not previously known to be altered in human cancer. Repeat elements were enriched in regions of driver genes, and their representation was altered by structural changes and epigenetic states. Machine learning analyses of genome-wide repeat landscapes and fragmentation profiles in cfDNA detected patients with early-stage lung or liver cancer in cross-validated and externally validated cohorts. In addition, these repeat landscapes could be used to noninvasively identify the tissue of origin of tumors. These analyses reveal widespread changes in repeat landscapes of human cancers and provide an approach for their detection and characterization that could benefit early detection and disease monitoring of patients with cancer.
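The abstract does not describe how ARTEMIS works internally, so the following is only a toy illustration of the general idea of k-mer counting that such an approach builds on, not the authors' method. The function names `count_kmers` and `per_million`, the choice of skipping k-mers that contain ambiguous bases, and the per-million normalization are all assumptions made for this sketch.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(reads, k):
    """Count every length-k substring across the reads.

    Hypothetical helper: k-mers containing anything other than
    A, C, G, or T (for example N) are skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts


def per_million(counts):
    """Scale raw counts to occurrences per million counted k-mers,
    so samples sequenced to different depths can be compared."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: 1e6 * n / total for kmer, n in counts.items()}


if __name__ == "__main__":
    # Made-up reads, for illustration only.
    reads = ["ACGTACGT", "ttagggttaggg", "ACNGT"]
    profile = per_million(count_kmers(reads, k=3))
    for kmer, value in sorted(profile.items(), key=lambda kv: -kv[1])[:5]:
        print(kmer, round(value, 1))
```

In a comparison between samples, normalized profiles like these would be the input to a downstream statistical or machine learning step; the details of that step in the published work are not given in the abstract.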
Keyphrases
- genome wide
- prostate cancer
- endothelial cells
- end stage renal disease
- early stage
- dna methylation
- papillary thyroid
- machine learning
- ejection fraction
- newly diagnosed
- chronic kidney disease
- induced pluripotent stem cells
- squamous cell
- gene expression
- peritoneal dialysis
- prognostic factors
- copy number
- radical prostatectomy
- squamous cell carcinoma
- radiation therapy
- lymph node
- transcription factor
- patient reported outcomes
- loop mediated isothermal amplification
- locally advanced
- genome wide analysis