New approach and new permutation tests with R programs for analyses of false-negative-contaminated data in medicine and biology.

Jaroslav Flegr, Petr Tureček
Published in: Biology Open (2020)
Statistically, the concentration of antibodies against parasites decreases with the duration of infection. This can result in false-negative outcomes of diagnostic tests for subjects with old infections. When a property of seronegative and seropositive subjects is compared under these circumstances, statistical tests may fail to detect a difference between the two groups even though they differ. When the effect of the infection is cumulative and subjects with older infections are affected to a greater degree, the comparison may even give paradoxical results: the seropositive subjects have, on average, a higher value of certain traits despite the infection having a negative effect on those traits. A permutation test for contaminated data, implemented e.g. in the program Treept and available as a comprehensibly commented R function at https://github.com/costlysignalling/Permutation_test_for_contaminated_data, can be used to reveal and eliminate the effect of false negatives. A Monte Carlo simulation in R showed that our permutation test is conservative: it could provide false-negative, but not false-positive, results if the studied population contains no false-negative subjects. The new R version of the test adds a skewness analysis, which helps to estimate the proportion of false-negative subjects based on the assumption of equal data skewness in the groups of healthy and infected subjects. Based on the results of simulations and our experience with empirical studies, we recommend using a permutation test for contaminated data whenever seronegative and seropositive individuals are compared.
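The dilution effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' R function and does not implement their correction for contaminated data; it only generates toy data in which a fraction of infected subjects test seronegative and then runs an ordinary two-sample permutation test on the contaminated groups. All names, group sizes, the effect size and the false-negative rate are assumptions chosen for illustration.

```python
import random
from statistics import fmean


def simulate(n_uninfected=500, n_infected=500, effect=-0.5,
             false_neg_rate=0.3, seed=1):
    """Toy data (hypothetical parameters): the trait is N(0, 1) in uninfected
    subjects and shifted by `effect` in infected ones. A fraction of the
    infected subjects is misclassified as seronegative."""
    rng = random.Random(seed)
    uninfected = [rng.gauss(0.0, 1.0) for _ in range(n_uninfected)]
    infected = [rng.gauss(effect, 1.0) for _ in range(n_infected)]
    n_false_neg = int(false_neg_rate * n_infected)
    # What the diagnostic test reports: false negatives join the negatives.
    seronegative = uninfected + infected[:n_false_neg]
    seropositive = infected[n_false_neg:]
    return uninfected, seronegative, seropositive


def permutation_p(a, b, n_perm=1000, seed=2):
    """Ordinary two-sided permutation test on the difference in means.
    This is the standard test, not the contaminated-data variant."""
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


uninfected, seronegative, seropositive = simulate()
# Gap that would be seen with a perfect diagnostic test.
true_gap = fmean(uninfected) - fmean(seropositive)
# Gap seen when false negatives contaminate the seronegative group.
observed_gap = fmean(seronegative) - fmean(seropositive)
p_value = permutation_p(seronegative, seropositive)

print(f"gap with perfect diagnosis: {true_gap:.3f}")
print(f"gap with false negatives:   {observed_gap:.3f}")
print(f"permutation p-value:        {p_value:.4f}")
```

In this toy setting the false negatives pull the seronegative mean toward the infected mean, so the observed gap is smaller than the gap under perfect diagnosis. With a smaller true effect or a higher false-negative rate, the same mechanism can push a standard test below the detection threshold, which is the situation the contaminated-data test is meant to address.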