PCA-PAM50 improves consistency between breast cancer intrinsic and clinical subtyping reclassifying a subset of luminal A tumors as luminal B.
Praveen-Kumar Raj-KumarJianfang LiuJeffrey A HookeAlbert J KovatichLeonid KvecherCraig D ShriverHai HuPublished in: Scientific reports (2019)
The PAM50 classifier is widely used for breast tumor intrinsic subtyping based on gene expression. Clinical subtyping, however, is based on immunohistochemistry assays of 3-4 biomarkers. Subtype calls by these two methods do not completely match even on comparable subtypes. Nevertheless, the estrogen receptor (ER)-balanced subset for gene-centering in PAM50 subtyping, is selected based on clinical ER status. Here we present a new method called Principle Component Analysis-based iterative PAM50 subtyping (PCA-PAM50) to perform intrinsic subtyping in ER status unbalanced cohorts. This method leverages PCA and iterative PAM50 calls to derive the gene expression-based ER status and a subsequent ER-balanced subset for gene centering. Applying PCA-PAM50 to three different breast cancer study cohorts, we observed improved consistency (by 6-9.3%) between intrinsic and clinical subtyping for all three cohorts. Particularly, a more aggressive subset of luminal A (LA) tumors as evidenced by higher MKI67 gene expression and worse patient survival outcomes, were reclassified as luminal B (LB) increasing the LB subtype consistency with IHC by 25-49%. In conclusion, we show that PCA-PAM50 enhances the consistency of breast cancer intrinsic and clinical subtyping by reclassifying an aggressive subset of LA tumors into LB. PCA-PAM50 code is available at ftp://ftp.wriwindber.org/ .