BOLD: Blood-gas and Oximetry Linked Dataset - Open Source Research.
João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie-Laure Charpignon, Xiaoli Liu, Nicoleta Economou-Zavlanos, Jaime Dos Santos Cardoso, Kimberly S Johnson, Nrupen Bhavsar, Judy Gichoya, Leo Anthony Celi, An-Kwok Ian Wong
Published in: medRxiv : the preprint server for health sciences (2023)
Pulse oximeters measure peripheral arterial oxygen saturation (SpO2) noninvasively, while the gold standard, arterial oxygen saturation (SaO2), requires an arterial blood gas measurement. Pulse oximeter performance is known to differ across racial and ethnic groups. BOLD is a new comprehensive dataset that aims to underscore the importance of addressing biases in pulse oximetry accuracy, which disproportionately affect darker-skinned patients. The dataset was created by harmonizing three Electronic Health Record databases (MIMIC-III, MIMIC-IV, eICU-CRD) comprising Intensive Care Unit stays of US patients. Paired SpO2 and SaO2 measurements were time-aligned and combined with various sociodemographic and other parameters to provide a detailed representation of each patient. BOLD includes 49,099 paired measurements, each taken within a 5-minute window and with oxygen saturation levels between 70% and 100%. Minority racial and ethnic groups account for about 25% of the data, a proportion seldom achieved in previous studies. The codebase is publicly available. Given the prevalent use of pulse oximeters in the hospital and at home, we hope that BOLD will be leveraged to develop debiasing algorithms that can result in more equitable healthcare solutions.
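The time-alignment step described in the abstract can be illustrated with a minimal sketch. This is not the BOLD codebase: the function name `pair_measurements`, the input layout (lists of `(timestamp, value)` tuples for a single ICU stay), the choice to take the latest SpO2 reading at or before each SaO2 measurement, and the inclusive 70-100% bounds are all assumptions made for illustration. Only the 5-minute window and the 70-100% range come from the abstract.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pair_measurements(spo2, sao2, window=timedelta(minutes=5), lo=70.0, hi=100.0):
    """Pair each SaO2 value with a nearby SpO2 value (hypothetical helper).

    spo2, sao2: lists of (timestamp, value) tuples for one patient stay.
    Assumption: the SpO2 reading must be taken at or before the SaO2
    measurement and no more than `window` earlier.
    """
    spo2 = sorted(spo2)
    spo2_times = [t for t, _ in spo2]
    pairs = []
    for t_abg, sao2_val in sorted(sao2):
        # Index of the latest SpO2 reading taken at or before the blood gas.
        i = bisect_right(spo2_times, t_abg) - 1
        if i < 0:
            continue  # no SpO2 reading precedes this blood gas
        t_spo2, spo2_val = spo2[i]
        if t_abg - t_spo2 > window:
            continue  # closest preceding reading is outside the window
        # Keep only pairs where both values fall in the plausible range
        # (bounds treated as inclusive, which is an assumption).
        if lo <= spo2_val <= hi and lo <= sao2_val <= hi:
            pairs.append((t_abg, spo2_val, sao2_val))
    return pairs


# Made-up example values, not taken from the dataset.
base = datetime(2023, 1, 1, 12, 0)
spo2 = [
    (base, 97.0),
    (base + timedelta(minutes=4), 95.0),
    (base + timedelta(minutes=30), 99.0),
]
sao2 = [
    (base + timedelta(minutes=5), 92.0),   # paired with the 12:04 reading
    (base + timedelta(minutes=20), 96.0),  # dropped: nearest SpO2 is 16 min old
    (base + timedelta(minutes=31), 65.0),  # dropped: SaO2 below 70%
]
pairs = pair_measurements(spo2, sao2)
```

In this toy example only the first blood gas yields a pair. A real pipeline would also need to group measurements by patient and stay and to harmonize units and column names across the three source databases, which this sketch does not attempt.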
Keyphrases
- healthcare
- end stage renal disease
- electronic health record
- intensive care unit
- newly diagnosed
- ejection fraction
- chronic kidney disease
- resting state
- blood pressure
- prognostic factors
- peritoneal dialysis
- machine learning
- functional connectivity
- emergency department
- big data
- deep learning
- room temperature
- case report
- health insurance
- adverse drug
- case control