Partial and complete dependency among data sets has minimal consequence on estimates from integrated population models.
Mitch D Weegman, Todd W Arnold, Robert G Clark, Michael Schaub
Published in: Ecological applications : a publication of the Ecological Society of America (2021)
Integrated population models (IPMs) are widely used to combine disparate data sets in a joint analysis to better understand population dynamics and provide guidance for conservation activities. An often-cited assumption of IPMs is independence among component data sets within the combined likelihood. Dependency among data sets is expected to bias estimates and lead to underestimation of variance because individuals contribute data to more than one data set. In practice, studied individuals often occur in multiple data sets in IPMs (i.e., overlap), which is one way for the independence assumption to be violated. Such cases have the potential to dissuade practitioners and limit application of IPMs to solve emerging ecological problems. We assessed precision and bias of demographic rates estimated from IPMs using a complete gradient (0-100%) of overlap among data sets, wide ranges in demographic rates (e.g., survival 0.1-0.8) and sample sizes (100-1,200 individuals), and variable data sources. We compared results from our simulations with those from IPMs constructed using empirical data on tree swallows (Tachycineta bicolor) where data sets either had complete overlap or included different individuals. Contrary to previous investigators, we found no substantive bias or uncertainty in any demographic rate from IPMs derived from data sets with complete overlap. While variability in demographic rates was greater at low sample sizes (i.e., low capture, recapture, and survey probabilities), there were negligible differences in the posterior mean or root mean square error of demographic rates among IPMs with strong dependence vs. complete independence among data sets. Our simulations suggest IPMs can be designed using only capture-recapture data or harvest and capture-recovery data where population estimates are obtained from the same data as survival and productivity data.
While we encourage researchers to carefully consider the modeling approach best suited for their data sets, our results suggest that dependence among data sets does not generally compromise IPM estimates. Thus, violation of the independence assumption should not dissuade researchers from the application of IPMs in ecological research.
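The comparison described above rests on two ingredients: data sets that share a controlled fraction of individuals, and summaries of bias and root mean square error across simulation replicates. The following is a minimal Python sketch of those two ingredients only, not the authors' simulation code. The function names (`overlapping_samples`, `bias`, `rmse`), the population size, and the overlap values shown are illustrative assumptions, and no IPM is fitted here.

```python
import math
import random


def bias(estimates, truth):
    """Mean of the estimates minus the true parameter value."""
    return sum(estimates) / len(estimates) - truth


def rmse(estimates, truth):
    """Root mean square error of the estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def overlapping_samples(ids, n, overlap, rng):
    """Draw two samples of size n that share round(overlap * n) individuals.

    `overlap` is a fraction in [0, 1]; 0 gives disjoint samples and 1 gives
    identical samples. This is a hypothetical helper for illustration.
    """
    n_shared = round(overlap * n)
    if 2 * n - n_shared > len(ids):
        raise ValueError("population too small for the requested samples")
    shuffled = rng.sample(ids, len(ids))        # random permutation of ids
    shared = shuffled[:n_shared]                # individuals in both data sets
    only_a = shuffled[n_shared:n]               # individuals only in data set A
    only_b = shuffled[n:2 * n - n_shared]       # individuals only in data set B
    return shared + only_a, shared + only_b


if __name__ == "__main__":
    rng = random.Random(1)
    population = list(range(1000))              # assumed population of 1,000 ids
    for overlap in (0.0, 0.5, 1.0):             # points along the overlap gradient
        a, b = overlapping_samples(population, 100, overlap, rng)
        print(overlap, len(set(a) & set(b)))
```

In a full simulation, each pair of samples would feed the component data sets of an IPM, and `bias` and `rmse` would be applied to the posterior means of a demographic rate collected over many replicates at each overlap level.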