Revisiting incidence rates comparison under right censorship.

Pablo Martinez-Camblor, Susana Díaz-Coto
Published in: The International Journal of Biostatistics (2023)
Data description is the first step toward understanding the nature of the problem at hand. Usually, it is a simple task that does not require any particular assumption. However, the interpretation of the descriptive measures used can be a source of confusion and misunderstanding. The incidence rate is the quotient between the number of observed events and the total time that the studied population was at risk of having this event (person-time). Despite this apparently simple definition, its interpretation is not free of complexity. In this piece of research, we revisit the incidence rate estimator under right censorship. We analyze the effect that the censoring time distribution can have on the observed results, and its relevance in the comparison of two or more incidence rates. We propose a solution for limiting the impact that the data collection process can have on the results of hypothesis testing. We explore the finite-sample behavior of the considered estimators through Monte Carlo simulations. Two examples based on synthetic data illustrate the considered problem. The R code and data used are provided as Supplementary Material.
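The definition above, and the sensitivity to the censoring distribution that the abstract describes, can be illustrated with a minimal sketch. It is written in Python rather than the authors' R, and it is not their simulation design: the function names `incidence_rate` and `simulate`, the Weibull event times with shape 2 (an increasing hazard), and the uniform censoring times are all assumptions chosen only for illustration.

```python
import random


def incidence_rate(follow_up, event):
    """Observed events divided by total person-time at risk."""
    person_time = sum(follow_up)
    if person_time <= 0:
        raise ValueError("person-time must be positive")
    return sum(event) / person_time


def simulate(n, shape, max_follow_up, seed=0):
    """Hypothetical design: Weibull event times (scale 1) and
    uniform censoring times on (0, max_follow_up)."""
    rng = random.Random(seed)
    follow_up, event = [], []
    for _ in range(n):
        t = rng.weibullvariate(1.0, shape)   # latent event time
        c = rng.uniform(0.0, max_follow_up)  # censoring time
        follow_up.append(min(t, c))          # observed time at risk
        event.append(1 if t <= c else 0)     # 1 = event observed, 0 = censored
    return follow_up, event


# Tiny worked case: 2 events over 2 + 3 + 5 = 10 units of person-time.
rate_small = incidence_rate([2.0, 3.0, 5.0], [1, 0, 1])

# Same event-time distribution, two different censoring schemes.
rate_short = incidence_rate(*simulate(20000, 2.0, 0.5))
rate_long = incidence_rate(*simulate(20000, 2.0, 3.0))

print(rate_small, rate_short, rate_long)
```

Under these assumptions the two simulated groups share the same event-time distribution, yet the group with longer follow-up shows a clearly higher incidence rate, because the hazard grows with time and longer follow-up accumulates more high-hazard person-time. With a constant hazard (shape 1) the two rates would instead estimate the same quantity.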
Keyphrases
  • electronic health record
  • risk factors
  • monte carlo
  • big data
  • machine learning
  • molecular dynamics