A multimodal framework for extraction and fusion of satellite images and public health data.
Dana Moukheiber, David Restrepo, Sebastián Andrés Cajas, María Patricia Arbeláez Montoya, Leo Anthony Celi, Kuan-Ting Kuo, Diego M López, Lama Moukheiber, Mira Moukheiber, Sulaiman Moukheiber, Juan Sebastian Osorio-Valencia, Saptarshi Purkayastha, Atika Rahman Paddo, Chenwei Wu, Po-Chih Kuo
Published in: Scientific Data (2024)
In low- and middle-income countries, the substantial costs of traditional data collection are an obstacle to decision-making in public health. Satellite imagery offers a potential solution, but image extraction and analysis can be costly and require specialized expertise. We introduce SatelliteBench, a scalable framework for satellite image extraction and vector embedding generation. We also propose a novel multimodal fusion pipeline that uses a series of satellite images together with metadata. We evaluated the framework by generating a dataset of 12,636 images and embeddings, accompanied by comprehensive metadata, from 81 municipalities in Colombia between 2016 and 2018. The dataset was then evaluated on three tasks: dengue case prediction, poverty assessment, and access to education. The results showcase the versatility and practicality of SatelliteBench, which offers a reproducible, accessible, and open tool to enhance decision-making in public health.