BRAX, Brazilian labeled chest x-ray dataset.
Eduardo Pontes Reis, Joselisa Péres Queiroz de Paiva, Maria C B da Silva, Guilherme A S Ribeiro, Victor F Paiva, Lucas Bulgarelli, Henrique Min Ho Lee, Paulo Victor Dos Santos, Vanessa M Brito, Lucas T W Amaral, Gabriel L Beraldo, Jorge N Haidar Filho, Gustavo B S Teles, Gilberto Szarf, Tom J Pollard, Alistair Edward William Johnson, Leo A Celi, Edson Amaro Júnior
Published in: Scientific Data (2022)
Chest radiographs allow for meticulous examination of a patient's chest but demand specialized training for proper interpretation. Automated analysis of medical imaging has become increasingly accessible with the advent of machine learning (ML) algorithms. Large labeled datasets are key to training and validating these ML solutions. In this paper, we describe the Brazilian labeled chest x-ray dataset, BRAX: an automatically labeled dataset designed to assist researchers in the validation of ML models. The dataset contains 24,959 chest radiography studies from patients presenting to a large general Brazilian hospital. A total of 40,967 images are available in the BRAX dataset. All images have been verified by trained radiologists and de-identified to protect patient privacy. Fourteen labels were derived from free-text radiology reports written in Brazilian Portuguese using natural language processing (NLP).
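To make the idea of deriving labels from free-text reports concrete, here is a minimal rule-based sketch in Python. It is not the labeling pipeline used for BRAX, which the abstract does not describe. The label names, the Portuguese trigger phrases, the negation cues, and the function `label_report` are all illustrative assumptions.

```python
import re

# Hypothetical keyword table: label name -> Portuguese trigger phrases.
# Illustrative assumptions only; this is not the BRAX rule set.
LABEL_PATTERNS = {
    "Cardiomegaly": [r"cardiomegalia", r"aumento da área cardíaca"],
    "Pneumothorax": [r"pneumotórax"],
    "Pleural Effusion": [r"derrame pleural"],
}

# Simple negation cues that may precede a finding within the same sentence.
NEGATION = re.compile(r"\b(sem|não há|ausência de)\b", re.IGNORECASE)


def label_report(text):
    """Map each label found in the report to 1 (mentioned) or 0 (negated)."""
    labels = {}
    # Treat periods, semicolons, and line breaks as sentence boundaries.
    for sentence in re.split(r"[.;\n]+", text):
        for label, patterns in LABEL_PATTERNS.items():
            for pattern in patterns:
                match = re.search(pattern, sentence, re.IGNORECASE)
                if match is None:
                    continue
                # Negated if a cue appears before the finding in this sentence.
                negated = NEGATION.search(sentence[:match.start()]) is not None
                # A positive mention anywhere in the report wins over a negated one.
                labels[label] = max(labels.get(label, 0), 0 if negated else 1)
    return labels
```

Labels that never appear in a report are simply absent from the result. A production labeler would also need to handle uncertainty phrases, longer-range negation, and spelling variants, none of which this sketch attempts.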
Keyphrases
- machine learning
- deep learning
- artificial intelligence
- high resolution
- pet imaging
- case report
- big data
- ejection fraction
- newly diagnosed
- optical coherence tomography
- prognostic factors
- adverse drug
- magnetic resonance
- virtual reality
- computed tomography
- photodynamic therapy
- patient reported outcomes
- smoking cessation
- single cell
- rna seq
- image quality
- case control
- clinical evaluation