Combining gene expression profiling and machine learning to diagnose B-cell non-Hodgkin lymphoma.
Victor Bobée, Fanny Drieux, Vinciane Marchand, Vincent Sater, Liana Veresezan, Jean-Michel Picquenot, Pierre-Julien Viailly, Marie-Delphine Lanic, Mathieu Viennot, Elodie Bohers, Lucie Oberic, Christiane Copie-Bergman, Thierry Jo Molina, Philippe Gaulard, Corinne Haioun, Gilles Andre Salles, Hervé Tilly, Fabrice Jardin, Philippe Ruminy
Published in: Blood cancer journal (2020)
Non-Hodgkin B-cell lymphomas (B-NHLs) are a highly heterogeneous group of mature B-cell malignancies. Their classification requires skillful evaluation by expert hematopathologists, yet the risk of error remains higher for these tumors than in many other areas of pathology. To facilitate diagnosis, we developed a gene expression assay able to discriminate the seven most frequent B-NHL categories. The assay combines ligation-dependent RT-PCR with next-generation sequencing and measures the expression of more than 130 genetic markers. It was designed to capture the main gene expression signatures of B-NHL cells and their microenvironment. Classification is handled by a random forest algorithm, which we trained and validated on a large cohort of more than 400 annotated cases of different histologies. Its clinical relevance was verified through its capacity to prevent important misclassifications in low-grade lymphomas and to retrieve clinically important characteristics in high-grade lymphomas, including the cell-of-origin signatures and the MYC and BCL2 expression levels. This accurate pan-B-NHL predictor, which allows a systematic evaluation of numerous diagnostic and prognostic markers, could be proposed as a complement to conventional histology to guide patient management and facilitate stratification into clinical trials.
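To make the classification step more concrete, the following is a minimal pure-Python sketch of how a random forest assigns a class from a vector of marker expression values. It is not the authors' pipeline: the data are synthetic, the class labels (`class_A`, `class_B`, `class_C`) and the four marker columns are hypothetical placeholders, and the hyperparameters (25 trees, depth 4, square-root feature sampling) are assumptions chosen only for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of class labels (0.0 for an empty one)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def majority(labels):
    """Most frequent label in a non-empty collection."""
    return Counter(labels).most_common(1)[0][0]


def best_split(X, y, n_try, rng):
    """Best (impurity, feature, threshold) among n_try randomly chosen features."""
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(X, y, depth, max_depth, n_try, rng):
    """Grow one tree; leaves are label strings, internal nodes are tuples."""
    if depth >= max_depth or len(set(y)) == 1:
        return majority(y)
    split = best_split(X, y, n_try, rng)
    if split is None:
        return majority(y)
    _, f, t = split
    left = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    right = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    if not left or not right:
        return majority(y)
    lX, ly = zip(*left)
    rX, ry = zip(*right)
    return (f, t,
            build_tree(lX, ly, depth + 1, max_depth, n_try, rng),
            build_tree(rX, ry, depth + 1, max_depth, n_try, rng))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(X, y, n_trees=25, max_depth=4, seed=0):
    """Train each tree on a bootstrap sample with random feature subsets."""
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 0, max_depth, n_try, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


def make_synthetic(n_per_class, rng):
    """Synthetic expression values; classes and markers are hypothetical."""
    centers = {"class_A": [8, 2, 2, 5], "class_B": [2, 8, 2, 5], "class_C": [2, 2, 8, 5]}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for mean in center])
            y.append(label)
    return X, y


rng = random.Random(42)
X_train, y_train = make_synthetic(40, rng)
X_test, y_test = make_synthetic(20, rng)
forest = train_forest(X_train, y_train)
correct = sum(predict_forest(forest, row) == lab for row, lab in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Each tree sees a bootstrap sample of cases and only a random subset of markers at each split, and the forest reports the majority vote. A real assay with more than 130 markers and seven categories would normally rely on an established library and a properly held-out validation cohort rather than a hand-written implementation like this one.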
Keyphrases
- gene expression
- machine learning
- low grade
- high grade
- deep learning
- dna methylation
- poor prognosis
- genome wide
- clinical trial
- artificial intelligence
- single cell
- high throughput
- induced apoptosis
- stem cells
- climate change
- big data
- copy number
- binding protein
- randomized controlled trial
- oxidative stress
- long non coding rna
- transcription factor
- phase ii
- cell death
- cell proliferation
- signaling pathway
- hodgkin lymphoma
- study protocol
- double blind