Prevalence of lung cancer in Colombia and a new diagnostic algorithm using health administrative databases: A real-world evidence study.
Javier Amaya-Nieto, Gabriel Torres, Giancarlo Buitrago Buitrago
Published in: PloS one (2023)
Reliable, timely and detailed information on lung cancer prevalence, mortality and costs from middle-income countries is essential to policy design. We therefore aimed to develop an electronic algorithm that identifies prevalent lung cancer patients in Colombia from administrative claims databases, and to estimate prevalence rates by age, sex and geographic region. We performed a cross-sectional study based on national claims databases in Colombia (Base de datos de suficiencia de la Unidad de Pago por Capitación and Base de Datos Única de Afiliados) to identify prevalent lung cancer patients in 2017, 2018 and 2019. We developed several algorithms based on the presence or absence of oncological procedures (chemotherapy, radiotherapy and surgery) and on the minimum number of months in which each individual had lung cancer ICD-10 codes. After testing 16 algorithms, we selected those whose prevalence rates were closest to the rates reported by aggregated official sources (Global Cancer Observatory and Cuenta de Alto Costo). We then estimated prevalence rates by age, sex and geographic region. Two algorithms were selected: i) the sensitive algorithm, defined as the presence of ICD-10 codes for 4 months or more; and ii) the specific algorithm, which additionally required at least one oncological procedure. The estimated prevalence rates per 100,000 inhabitants ranged from 11.14 to 18.05 across the contributory and subsidized regimes in 2017, 2018 and 2019. In the contributory regime, rates were higher in women (15.43, 15.61 and 17.03 per 100,000 in 2017, 2018 and 2019), in people over 65 years old (63.45, 56.92 and 61.79 per 100,000 in 2017, 2018 and 2019), and in residents of the Central, Bogota and Pacific regions.
The selected algorithms produced aggregated prevalence estimates similar to the rates reported by official sources and allowed us to estimate prevalence rates for specific age, regional and sex groups in Colombia from national claims databases. These findings could be useful for identifying clinical and economic outcomes of lung cancer patients using national individual-level databases.
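
The two selected case definitions can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: the record layout (`patient_id`, `month`, `icd10`, `oncological_procedure`), the use of ICD-10 category C34 (malignant neoplasm of bronchus and lung) as the lung cancer code set, the reading of "4 months or more" as four distinct calendar months, and the choice to count a procedure only when it appears on a claim that carries a lung cancer code. The function names and the example claims are hypothetical.

```python
from collections import defaultdict

# Assumption: lung cancer is identified by ICD-10 category C34; the abstract
# does not list the exact codes used.
LUNG_CANCER_PREFIX = "C34"
# From the abstract: codes must be present for 4 months or more.
MIN_MONTHS = 4


def classify_patients(claims):
    """Return (sensitive, specific) sets of patient ids.

    claims: iterable of dicts with the hypothetical keys
    patient_id, month ("YYYY-MM"), icd10, oncological_procedure (bool).
    """
    months_with_code = defaultdict(set)  # patient -> distinct months with a C34 code
    with_procedure = set()               # patients with >= 1 oncological procedure

    for claim in claims:
        if not claim["icd10"].startswith(LUNG_CANCER_PREFIX):
            continue  # ignore claims without a lung cancer code
        pid = claim["patient_id"]
        months_with_code[pid].add(claim["month"])
        if claim["oncological_procedure"]:
            with_procedure.add(pid)

    # Sensitive definition: lung cancer codes in at least MIN_MONTHS distinct months.
    sensitive = {pid for pid, m in months_with_code.items() if len(m) >= MIN_MONTHS}
    # Specific definition: sensitive plus at least one oncological procedure.
    specific = sensitive & with_procedure
    return sensitive, specific


def prevalence_per_100k(cases, population):
    """Prevalence rate per 100,000 inhabitants."""
    return 100_000 * cases / population


# Made-up example claims, not study data.
def _claims_for(pid, code, n_months, procedure_month=None):
    return [
        {
            "patient_id": pid,
            "month": f"2018-{m:02d}",
            "icd10": code,
            "oncological_procedure": m == procedure_month,
        }
        for m in range(1, n_months + 1)
    ]


example_claims = (
    _claims_for("A", "C34.9", 4, procedure_month=2)    # 4 months + procedure
    + _claims_for("B", "C34.1", 4)                     # 4 months, no procedure
    + _claims_for("C", "C34.9", 3, procedure_month=1)  # only 3 months
    + _claims_for("D", "J18.9", 5, procedure_month=1)  # not a lung cancer code
)

sensitive, specific = classify_patients(example_claims)
print(sorted(sensitive), sorted(specific))
```

In this toy input, patients A and B meet the sensitive definition and only A meets the specific one. Applying `prevalence_per_100k` to the size of either set and the corresponding population count (by regime, age group, sex or region) gives a stratified rate of the kind reported above.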