Multivariate Data Analysis Methodology to Solve Data Challenges Related to Scale-Up Model Validation and Missing Data on a Micro-Bioreactor System.
Stephen Goldrick, Viktor Sandner, Matthew Cheeks, Richard Turner, Suzanne S Farid, Graham McCreath, Jarka Glassey
Published in: Biotechnology Journal (2019)
Multivariate data analysis (MVDA) is a highly valuable and significantly underutilized resource in biomanufacturing. It offers the opportunity to enhance understanding and leverage useful information from complex high-dimensional data sets, recorded throughout all stages of therapeutic drug manufacture. To help standardize the application and promote this resource within the biopharmaceutical industry, this paper outlines a novel MVDA methodology describing the necessary steps for efficient and effective data analysis. The MVDA methodology is followed to solve two case studies: a "small data" and a "big data" challenge. In the "small data" example, a large-scale data set is compared to data from a scale-down model. This methodology enables a new quantitative metric for equivalence to be established by combining the two one-sided tests (TOST) procedure with principal component analysis (PCA). In the "big data" example, this methodology enables accurate predictions of critical missing data essential to a cloning study performed in the ambr15 system. These predictions are generated by building a partial least squares (PLS) model that exploits the underlying relationship between the missing off-line values and the on-line measurements. In summary, the proposed MVDA methodology highlights the importance of data pre-processing, restructuring, and visualization during data analytics to solve complex biopharmaceutical challenges.
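The "big data" idea above, predicting missing off-line values from on-line measurements with a partial least squares model, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the data are synthetic, the helper names fit_pls1 and predict_pls1 are hypothetical, the NIPALS variant of PLS with a single response is one common choice among several, and the number of components is picked by hand where a real study would use cross-validation.

```python
import numpy as np


def fit_pls1(X, y, n_components=2):
    """Fit a single-response PLS model with NIPALS (illustrative helper)."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean = y.mean()
    Xk = (X - x_mean) / x_std   # autoscale the on-line variables
    yk = y - y_mean             # mean-centre the off-line response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X and y before the next component
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in scaled X space
    return {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean, "coef": coef}


def predict_pls1(model, X):
    """Predict the response for new rows of on-line measurements."""
    Xs = (X - model["x_mean"]) / model["x_std"]
    return Xs @ model["coef"] + model["y_mean"]


# Synthetic stand-in data (assumption): 60 vessels, 6 on-line variables
# driven by 2 latent process factors, and one off-line measurement.
rng = np.random.default_rng(0)
n_wells, n_online = 60, 6
latent = rng.normal(size=(n_wells, 2))
loadings = rng.normal(size=(2, n_online))
X_online = latent @ loadings + 0.05 * rng.normal(size=(n_wells, n_online))
y_true = latent @ np.array([1.5, -0.8]) + 0.05 * rng.normal(size=n_wells)

# Hide 15 off-line values to mimic missing data.
y_offline = y_true.copy()
missing = np.zeros(n_wells, dtype=bool)
missing[rng.choice(n_wells, size=15, replace=False)] = True
y_offline[missing] = np.nan

# Train on vessels with an off-line value, then predict the missing ones.
model = fit_pls1(X_online[~missing], y_offline[~missing], n_components=2)
y_pred = predict_pls1(model, X_online[missing])
rmse = float(np.sqrt(np.mean((y_pred - y_true[missing]) ** 2)))
print(f"RMSE on hidden off-line values: {rmse:.3f}")
```

In this sketch the hidden values are known, so the prediction error can be checked directly. With real data the error would have to be estimated by cross-validation on the vessels that do have off-line measurements.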