Data integration methods to account for spatial niche truncation effects in regional projections of species distribution.

Mathieu Chevalier, Olivier Broennimann, Josselin Cornuault, Antoine Guisan
Published in: Ecological applications : a publication of the Ecological Society of America (2021)
Many species distribution models (SDMs) are built with precise but geographically restricted presence-absence data sets (e.g., a country) where only a subset of the environmental conditions experienced by a species across its range is considered (i.e., spatial niche truncation). This type of truncation is worrisome because it can lead to incorrect predictions, e.g., when projecting to future climatic conditions that belong to the species' niche but are unavailable in the calibration area. Data from citizen-science programs, species range maps, or atlases covering the full species range can be used to capture those parts of the species' niche that are missing regionally. However, these data are usually too coarse or too biased to support regional management. Here, we aim to (1) demonstrate how varying degrees of spatial niche truncation affect SDM projections when models are calibrated with climatically truncated regional data sets and (2) test the performance of different methods that harness information from larger-scale data sets of differing spatial resolution to solve the spatial niche truncation problem. We used simulations to compare the performance of the different methods, and applied them to a real data set to predict the future distribution of a plant species (Potentilla aurea) in Switzerland. As expected, SDMs calibrated with geographically restricted data sets provided biased predictions when projected outside the calibration area or time period. Approaches integrating information from larger-scale data sets using hierarchical data integration methods usually reduced this bias. However, their performance varied depending on the level of spatial niche truncation and how data were combined.
Interestingly, while some methods (e.g., data pooling, downscaling) performed well on both simulated and real data, others (e.g., those based on a Poisson point process) performed better on real data, indicating a dependency of model performance on the simulation process (e.g., shape of simulated response curves). Based on our results, we recommend using different data integration methods and, whenever possible, choosing among them based on model performance. In any case, an ensemble modeling approach can be used to account for uncertainty in how niche truncation is handled and to identify areas where methods agree or disagree.
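The ensemble idea in the last sentence can be illustrated with a minimal sketch. It is not the authors' implementation: the suitability values, the method labels used as dictionary keys, the function name `ensemble_summary`, and the 0.1 disagreement threshold are all hypothetical, chosen only to show how an across-method mean and spread could be computed per grid cell.

```python
from statistics import mean, pstdev

# Hypothetical habitat-suitability predictions (0 to 1) for five grid cells,
# one list per data integration method. The numbers are illustrative only.
predictions = {
    "regional_only": [0.10, 0.80, 0.40, 0.05, 0.90],
    "data_pooling": [0.15, 0.75, 0.70, 0.05, 0.85],
    "downscaling": [0.12, 0.78, 0.65, 0.10, 0.88],
}


def ensemble_summary(preds, threshold=0.1):
    """Return per-cell ensemble mean, across-method spread, and disagreement flags.

    `threshold` is an arbitrary (assumed) cutoff on the population standard
    deviation above which methods are considered to disagree in a cell.
    """
    # Regroup the predictions so that each tuple holds one cell's values
    # across all methods.
    per_cell = list(zip(*preds.values()))
    means = [mean(cell) for cell in per_cell]
    spreads = [pstdev(cell) for cell in per_cell]
    flags = [spread > threshold for spread in spreads]
    return means, spreads, flags


means, spreads, flags = ensemble_summary(predictions)
for i, (m, s, flagged) in enumerate(zip(means, spreads, flags)):
    label = "methods disagree" if flagged else "methods agree"
    print(f"cell {i}: mean={m:.2f} spread={s:.2f} ({label})")
```

In this toy input only the third cell is flagged, because the regional-only prediction there differs markedly from the two predictions that draw on larger-scale data. In practice the threshold, or any other measure of disagreement, would need to be chosen for the study at hand.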
Keyphrases
  • electronic health record
  • big data
  • public health
  • machine learning
  • healthcare
  • climate change
  • data analysis
  • artificial intelligence
  • convolutional neural network