
A machine learning method for estimating the probability of presence using presence-background data.

Yan Wang, Chathuri L Samarasekara, Lewi Stone
Published in: Ecology and Evolution (2022)
Estimating the prevalence, or the absolute probability of presence, of a species from presence-background data has become a controversial topic in species distribution modelling. In this paper, we propose a new method that combines statistical and machine learning algorithms to help overcome some of the known problems. We also revisit the popular but highly controversial Lele and Keim (LK) method by evaluating its performance and assessing the RSPF condition on which it relies. Simulations show that the LK method under the RSPF assumptions yields fragile estimates and predictions of the desired probabilities. Instead, we propose the local knowledge condition, which relaxes the predetermined population prevalence condition used in much of the existing literature. Simulations demonstrate that the new method, using the local knowledge assumption, successfully estimates the probability of presence. The local knowledge condition extends the local certainty (or prototypical presence location) assumption, and has significant implications for demonstrating the necessary condition for identifying the absolute (rather than relative) probability of presence from presence-background data, without absence data, in species distribution modelling.
Keyphrases
  • machine learning
  • healthcare
  • big data
  • risk factors
  • electronic health record
  • mental health
  • molecular dynamics
  • artificial intelligence
  • data analysis
  • monte carlo