An Automated Bioinformatics Pipeline Informing Near-Real-Time Public Health Responses to New HIV Diagnoses in a Statewide HIV Epidemic.

Mark Howison, Fizza S Gillani, Vlad Novitsky, Jon A Steingrimsson, John Fulton, Thomas Bertrand, Katharine Howe, Anna Civitarese, Lila Bhattarai, Meghan MacAskill, Guillermo Ronquillo, Joel Hague, Casey W Dunn, Utpala Bandy, Joseph W Hogan, Rami Kantor

Published in: Viruses (2023)

Molecular HIV cluster data can guide public health responses towards ending the HIV epidemic. Currently, real-time data integration, analysis, and interpretation are challenging, leading to a delayed public health response. We present a comprehensive methodology for addressing these challenges through data integration, analysis, and reporting. We integrated heterogeneous data sources across systems and developed an open-source, automatic bioinformatics pipeline that provides molecular HIV cluster data to inform public health responses to new statewide HIV-1 diagnoses, overcoming data management, computational, and analytical challenges. We demonstrate implementation of this pipeline in a statewide HIV epidemic and use it to compare the impact of specific phylogenetic and distance-only methods and datasets on molecular HIV cluster analyses. The pipeline was applied to 18 monthly datasets generated between January 2020 and June 2022 in Rhode Island, USA, that provide statewide molecular HIV data to support routine public health case management by a multi-disciplinary team. The resulting cluster analyses and near-real-time reporting guided public health actions in 37 phylogenetically clustered cases out of 57 new HIV-1 diagnoses. Of the 37, only 21 (57%) clustered by distance-only methods. Through a unique academic-public health partnership, an automated open-source pipeline was developed and applied to prospective, routine analysis of statewide molecular HIV data in near-real-time. This collaboration informed public health actions to optimize disruption of HIV transmission.
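
To illustrate what a distance-only cluster analysis involves in general, the following is a minimal sketch, not the pipeline's actual implementation. It assumes pre-aligned sequences, uses a simple proportion-of-differences (p-distance) in place of whatever evolutionary distance model the pipeline applies, and treats clusters as connected components of sequence pairs at or below a caller-supplied threshold. The function names `p_distance` and `distance_clusters`, and the threshold value, are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or ambiguous base ('N')
    are skipped. This is a simplification assumed for illustration.
    """
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    # With no comparable sites, treat the pair as maximally distant.
    return differing / compared if compared else 1.0


def distance_clusters(seqs: Dict[str, str], threshold: float) -> List[Set[str]]:
    """Group sequences into clusters by linking pairs within the threshold.

    Clusters are connected components (found with union-find) that
    contain at least two sequences; singletons are not reported.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) >= 2]
```

A sketch like this links cases only by pairwise distance and ignores tree topology and branch support. That is one reason a phylogenetic analysis can cluster cases that a distance-only analysis does not, as the 37 versus 21 comparison above reports.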
