Breaking the 80:20 rule in health research using large administrative data sets.
Shelly Vik, Judy Seidel, Christopher Smith, Deborah A Marshall. Published in: Health informatics journal (2023)
Objective: To explore the application of online analytic processing (OLAP) to improve the efficiency of analytics using large administrative health data sets. Methods: 18 years of administrative health data (1994/95 to 2012/13) were obtained from the Alberta Ministry of Health in Canada. The data sets included hospitalization, ambulatory care, and practitioner claims data. Reference files were obtained that provided information including patient demographics, resident postal code, facility, and provider details. Population counts and projections for each year, sex, and age were included for rate calculations. These sources were used to develop a data cube using OLAP tools. Results: For simple queries that did not require linkage of data sets, run-time with the data cube was reduced to 5% of that required by conventional methods. The data cube removed the need for many intermediary data extraction and analysis steps in research activities. Conventional methods required over 250 GB of server space for multiple analytic subsets, compared to only 10.3 GB for the data cube. Conclusions: Cross-training in information technology and health analytics is recommended to build capacity to better leverage OLAP tools, which are available with many common applications.