CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records.
Sicheng Zhou, Nan Wang, Liwei Wang, Hongfang Liu, Rui Zhang
Published in: Journal of the American Medical Informatics Association : JAMIA (2022)
The CancerBERT models were developed to extract breast cancer phenotypes from clinical notes and pathology reports. The results show that a customized vocabulary may further improve the performance of domain-specific BERT models on clinical NLP tasks. The CancerBERT models developed in this study could in turn aid clinical decision support.
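
As a rough illustration of why a customized vocabulary might matter, the sketch below applies greedy longest-match WordPiece splitting (the subword scheme used by BERT) to a drug name under two toy vocabularies. The vocabularies, the `wordpiece_tokenize` helper, and the example terms are hypothetical and chosen only for illustration; they are not the vocabulary or tokenizer code used for CancerBERT.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first WordPiece split of a single lowercase word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece covers this position
        pieces.append(match)
        start = end
    return pieces


# Toy general-domain vocabulary (hypothetical).
general_vocab = {"her", "##ce", "##pt", "##in", "tam", "##ox", "##if", "##en"}
# The same vocabulary extended with hypothetical cancer-domain terms.
cancer_vocab = general_vocab | {"herceptin", "tamoxifen"}

general_pieces = wordpiece_tokenize("herceptin", general_vocab)
cancer_pieces = wordpiece_tokenize("herceptin", cancer_vocab)
print(general_pieces)  # ['her', '##ce', '##pt', '##in']
print(cancer_pieces)   # ['herceptin']
```

With the general vocabulary the term is split into four fragments, while the extended vocabulary keeps it as a single token, so a model would see one consistent unit for the phenotype-relevant term. Whether this translates into better extraction performance is an empirical question of the kind the study evaluates.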