New partition based measures for data compatibility and information gain.

Daoyuan Shi, Ming-Hui Chen, Lynn Kuo, Paul O Lewis
Published in: Statistics in Medicine (2021)
It is of great practical importance to compare and combine data from different studies in order to carry out appropriate and more powerful statistical inference. We propose a partition-based measure to quantify the compatibility of two datasets using their respective posterior distributions. We further propose an information gain measure to quantify the information increase (or decrease) in combining two datasets. These measures are well calibrated, and efficient computational algorithms are provided for their calculation. We use examples from a benchmark dose toxicology study, a six-cities pollution dataset, and a melanoma clinical trial to illustrate how these two measures are useful in combining current data with historical data and missing data.
Keyphrases
  • electronic health record
  • big data
  • clinical trial
  • machine learning
  • randomized controlled trial
  • healthcare
  • artificial intelligence
  • study protocol
  • air pollution
  • density functional theory
  • Monte Carlo