Building use-inspired species distribution models: using multiple data types to examine and improve model performance.

Camrin D Braun, Martin C Arostegui, Nima Farchadi, Michael Alexander, Pedro Afonso, Andrew Allyn, Steven J Bograd, Stephanie Brodie, Daniel P Crear, Emmett F Culhane, Tobey H Curtis, Elliott L Hazen, Alex Kerney, Nerea Lezama-Ochoa, Katherine E Mills, Dylan Pugh, Nuno Queiroz, James D Scott, Gregory B Skomal, David W Sims, Simon R Thorrold, Heather Welch, Riley Young-Morse, Rebecca Lewison
Published in: Ecological Applications: a publication of the Ecological Society of America (2023)
Species distribution models (SDMs) are becoming an important tool for marine conservation and management. Yet while there is an increasing diversity and volume of marine biodiversity data for training SDMs, little practical guidance is available on how to leverage distinct data types to build robust models. We explored the effect of different data types on the fit, performance and predictive ability of SDMs by comparing models trained with four data types for a heavily exploited pelagic fish, the blue shark (Prionace glauca), in the Northwest Atlantic: two fishery-dependent (conventional mark-recapture tags, fisheries observer records) and two fishery-independent (satellite-linked electronic tags, pop-up archival tags). We found that all four data types can result in robust models, but differences among spatial predictions highlighted the need to consider ecological realism in model selection and interpretation regardless of data type. Differences among models were primarily attributed to biases in how each data type, and the associated representation of absences, sampled the environment and summarized the resulting species distributions. Outputs from model ensembles and a model trained on all pooled data both proved effective for combining inferences across data types and provided more ecologically realistic predictions than individual models. Our results provide valuable guidance for practitioners developing SDMs. With increasing access to diverse data sources, future work should further develop truly integrative modeling approaches that can explicitly leverage strengths of individual data types while statistically accounting for limitations, such as sampling biases.
Keyphrases
  • electronic health record
  • big data
  • study protocol
  • risk assessment
  • climate change
  • machine learning
  • deep learning
  • current status
  • placebo controlled