
Interpolation of microbiome composition in longitudinal data sets.

Omri Peleg, Elhanan Borenstein
Published in: mBio (2024)
Because missing samples are common in longitudinal microbiome datasets owing to inconsistent collection practices, it is important to evaluate and benchmark interpolation methods that predict microbiome composition in such samples and thereby facilitate downstream analysis. Our study rigorously evaluated several such methods and identified the K-nearest neighbors approach as particularly effective for this task. The study also reports significant variability in interpolation accuracy among individuals, influenced by factors such as age, sample size, and sampling frequency. Furthermore, we developed a predictive model for estimating interpolation accuracy at a specific time point, enhancing the reliability of such analyses in future studies. Together, these results provide critical insights and tools that improve the accuracy and reliability of data interpolation in the growing field of longitudinal microbiome research.
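To make the idea of K-nearest neighbors interpolation concrete, the following is a minimal sketch and not the implementation used in the study. It assumes that neighbors are the k sampled time points closest in time within one individual's series, that they are weighted by inverse time distance, and that the estimate is renormalized to relative abundances. The function name `knn_interpolate` and the toy data are hypothetical.

```python
import numpy as np

def knn_interpolate(times, abundances, target_time, k=2):
    """Estimate composition at target_time from the k nearest sampled time points.

    Assumption (not from the paper): neighbors are chosen by absolute time
    distance and weighted by inverse distance.
    """
    times = np.asarray(times, dtype=float)
    abundances = np.asarray(abundances, dtype=float)  # rows: samples, columns: taxa
    dist = np.abs(times - target_time)
    nearest = np.argsort(dist, kind="stable")[:k]     # indices of the k closest samples
    if dist[nearest[0]] == 0:
        # The target time was actually sampled; return that sample, renormalized.
        estimate = abundances[nearest[0]]
    else:
        weights = 1.0 / dist[nearest]
        estimate = weights @ abundances[nearest] / weights.sum()
    return estimate / estimate.sum()                  # keep relative abundances summing to 1

# Toy series (hypothetical data): four sampled days, three taxa, day 2 missing.
times = [0, 1, 3, 4]
abundances = [
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
estimate = knn_interpolate(times, abundances, target_time=2, k=2)
```

In this toy case the two nearest samples (days 1 and 3) are equally distant from day 2, so the estimate is their simple average. Other neighbor definitions, such as similarity between samples from different individuals, are equally plausible readings of the approach.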
Keyphrases
  • healthcare
  • electronic health record
  • machine learning
  • big data
  • artificial intelligence
  • data analysis
  • case control