Ovarian cancer is detectable from peripheral blood using machine learning over T-cell receptor repertoires.
Miriam Zuckerbrot-Schuldenfrei, Sarit Aviel-Ronen, Alona Zilberberg, Sol Efroni
Published in: Briefings in Bioinformatics (2024)
The extraordinary diversity of T cells and B cells is critical for body maintenance. This diversity has an important role in protecting against tumor formation. In humans, the T-cell receptor (TCR) repertoire is generated through a striking stochastic process called V(D)J recombination, in which different gene segments are assembled and modified, leading to extensive variety. In ovarian cancer (OC), an unfortunate 80% of cases are detected late, leading to poor survival outcomes. However, when the disease is detected early, approximately 94% of patients live longer than 5 years after diagnosis. Thus, early detection is critical for patient survival. To determine whether the TCR repertoire obtained from peripheral blood is associated with tumor status, we collected blood samples from 85 women with or without OC and obtained TCR information. We then used machine learning to learn the characteristics of the samples and then to predict, over a set of unseen samples, whether each person has OC. We successfully stratified the two groups, thereby associating the peripheral blood TCR repertoire with the formation of OC tumors. A careful study of the origin of the T cells most informative for the signature indicated the involvement of a specific invariant natural killer T (iNKT) clone and a specific mucosal-associated invariant T (MAIT) clone. Our findings support the proposition that a tumor-relevant signal is maintained by the immune system and is encoded in the T-cell repertoire available in peripheral blood. It is also possible that the immune system detects tumors early enough for repertoire technologies to inform us near the beginning of tumor formation. Although such detection is made by the immune system, we might be able to identify it using repertoire data from peripheral blood, offering a pragmatic way to search for early signs of cancer with minimal patient burden, possibly with enhanced sensitivity.
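The train-then-predict-on-unseen-samples workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the features or the model, so the two repertoire features (Shannon diversity and largest clone fraction), the nearest-centroid classifier, the synthetic repertoires, and all function names below are assumptions chosen only to keep the example self-contained.

```python
import math
import random
from collections import Counter

def repertoire_features(clone_counts):
    """Summarize one repertoire (clonotype -> read count) as two numbers."""
    total = sum(clone_counts.values())
    freqs = [c / total for c in clone_counts.values() if c > 0]
    entropy = -sum(f * math.log(f) for f in freqs)  # Shannon diversity, in nats
    top_clone = max(freqs)                          # fraction held by the largest clone
    return (entropy, top_clone)

def fit_centroids(features, labels):
    """Mean feature vector per class (a nearest-centroid classifier)."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}

def predict(centroids, x):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

def toy_repertoire(rng, skew):
    """Synthetic repertoire (hypothetical data): skew > 1 lets a few clones dominate."""
    counts = Counter()
    for _ in range(500):
        idx = int(200 * rng.random() ** skew)  # higher skew favors low clone indices
        counts[f"clone_{idx}"] += 1
    return counts

# Build a labeled toy cohort; the "case"/"control" labels are purely synthetic.
rng = random.Random(0)
samples = [(toy_repertoire(rng, 1.0), "control") for _ in range(40)]
samples += [(toy_repertoire(rng, 3.0), "case") for _ in range(40)]
rng.shuffle(samples)

# Hold out unseen samples, fit on the rest, and score only the held-out set.
train, test = samples[:60], samples[60:]
centroids = fit_centroids(
    [repertoire_features(r) for r, _ in train],
    [y for _, y in train],
)
correct = sum(predict(centroids, repertoire_features(r)) == y for r, y in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the evaluation design: the classifier never sees the held-out repertoires during fitting, so the reported accuracy reflects prediction on unseen samples. Real repertoire data would need sequencing-depth normalization and a far richer feature set than these two summary statistics.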
Keyphrases
- peripheral blood
- high throughput sequencing
- regulatory T cells
- machine learning
- end stage renal disease
- ejection fraction
- newly diagnosed
- case report
- chronic kidney disease
- randomized controlled trial
- DNA damage
- immune response
- clinical trial
- squamous cell carcinoma
- prognostic factors
- dendritic cells
- peritoneal dialysis
- oxidative stress
- deep learning
- ulcerative colitis
- sensitive detection
- squamous cell
- real-time PCR