A Brazilian classified data set for prognosis of tuberculosis, between January 2001 and April 2020.
Maicon Herverton Lino Ferreira da Silva Barros, Guto Leoni Santos, Maria Gabriela de Almeida Rodrigues, Vanderson de Souza Sampaio, Theo Lynn, Patricia Takako Endo
Published in: Scientific Data (2022)
After COVID-19, tuberculosis (TB) is the leading cause of death by an infectious disease in the world. This work presents a data set based on data collected from the Brazilian Information System for Notifiable Diseases (SINAN) for the period from January 2001 to April 2020 relating to patients diagnosed with tuberculosis in Brazil. The data from SINAN was pre-processed to generate a new data set with two distinct treatment outcome classes: CURED and DIED. The data set comprises 37 categorical attributes (including socio-demographic, clinical, and laboratory data) as well as the target class. There are 927,909 records of patients classified as CURED and 36,190 classified as DIED, totaling 964,099 records.
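The class counts above imply a strongly imbalanced target. The following is a minimal sketch, using only the figures stated in the abstract, that checks the reported total and computes each class's share. The "balanced" weighting formula (total divided by the number of classes times the class count) is a common heuristic, the same one scikit-learn uses for `class_weight="balanced"`; it is shown here as an assumption for illustration, not as the authors' method.

```python
# Class counts as reported in the abstract.
counts = {"CURED": 927_909, "DIED": 36_190}

# Total number of records; the abstract reports 964,099.
total = sum(counts.values())

# Share of each outcome class in the data set.
shares = {label: n / total for label, n in counts.items()}

# Ratio of the majority class (CURED) to the minority class (DIED).
imbalance_ratio = counts["CURED"] / counts["DIED"]

# Illustrative "balanced" class weights: total / (n_classes * class_count).
# This heuristic is an assumption, not something described in the abstract.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}: {n:,} records, share={shares[label]:.4f}, weight={weights[label]:.3f}")
print(f"total={total:,}, imbalance ratio={imbalance_ratio:.1f}:1")
```

With roughly 25 CURED records for every DIED record, a model trained on this data set without any correction could score high accuracy while rarely predicting DIED, so weighting, resampling, or imbalance-aware metrics are worth considering.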
Keyphrases
- electronic health record
- big data
- end stage renal disease
- mycobacterium tuberculosis
- ejection fraction
- chronic kidney disease
- healthcare
- emergency department
- prognostic factors
- machine learning
- hiv aids
- peritoneal dialysis
- artificial intelligence
- human immunodeficiency virus
- infectious diseases
- adverse drug
- patient reported