Species distribution modelling—Effect of design and sample size of pseudo-absence observations
AUC, Ecological niche model, Presence only, Pseudo-absence, Species distribution model
We explored the effect of varying pseudo-absence data in species distribution modelling using empirical data for four real species and simulated data for two imaginary species. In all analyses we used a fixed study area, a fixed set of environmental predictors and a fixed set of presence observations. To these we added pseudo-absence data generated by different sampling designs and in different numbers, to assess the relative importance of design and sample size for the output of the species distribution model. The sampling design strongly influenced the predictive performance of the models, whereas the number of pseudo-absences had minimal effect on predictive performance. We attribute much of this pattern to the relationship between the environmental range of the pseudo-absences (i.e. the extent of the environmental space being considered) and the environmental range of the presence observations (i.e. the environmental conditions under which the species occurs). The number of generated pseudo-absences had a direct effect on the predicted probability, which in turn translated into different predicted distribution areas. Pseudo-absence observations that fell within grid cells with presence observations were purposely included in our analyses. We discourage the practice of excluding certain pseudo-absence data because it involves arbitrary assumptions about which environments are suitable or unsuitable for the species being modelled.
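The dependence of predicted probability on the number of pseudo-absences can be illustrated with a minimal sketch. It is not the modelling procedure used in this study: it assumes a plain logistic regression on a single synthetic environmental predictor, and every name and value in it (`N_PRESENCE`, `summarise`, the predictor range of -3 to 3, the 0.5 threshold) is hypothetical. The presence set is held fixed while only the number of uniformly sampled pseudo-absences changes.

```python
import numpy as np

N_PRESENCE = 200  # hypothetical size of the fixed presence set


def design(x):
    """Intercept, linear and quadratic terms of one environmental predictor."""
    return np.column_stack([np.ones_like(x), x, x ** 2])


def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def log_lik(A, y, beta):
    z = A @ beta
    return np.sum(y * z - np.logaddexp(0.0, z))


def fit_logistic(x, y, n_iter=100, tol=1e-9):
    """Maximum-likelihood logistic regression: Newton steps with step halving."""
    A = design(x)
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (y - p)
        if np.max(np.abs(grad)) < tol:
            break
        hess = (A * (p * (1.0 - p))[:, None]).T @ A
        step = np.linalg.solve(hess, grad)
        t, current = 1.0, log_lik(A, y, beta)
        # Halve the step while it clearly lowers the log-likelihood.
        while log_lik(A, y, beta + t * step) < current - 1e-9 and t > 1e-8:
            t *= 0.5
        beta = beta + t * step
    return beta


def summarise(n_pseudo_absence, seed=0):
    """Fit the model for a given number of pseudo-absences and summarise it."""
    # Fixed seed: the presence observations are identical in every run.
    presence = np.random.default_rng(42).normal(1.0, 0.5, N_PRESENCE)
    # Pseudo-absences drawn uniformly over the whole (synthetic) gradient.
    pseudo_absence = np.random.default_rng(seed).uniform(-3.0, 3.0, n_pseudo_absence)
    x = np.concatenate([presence, pseudo_absence])
    y = np.concatenate([np.ones(N_PRESENCE), np.zeros(n_pseudo_absence)])
    beta = fit_logistic(x, y)
    grid = np.linspace(-3.0, 3.0, 601)  # stand-in for the study-area grid cells
    p_grid = sigmoid(design(grid) @ beta)
    return {
        "prevalence": N_PRESENCE / (N_PRESENCE + n_pseudo_absence),
        "mean_fitted": float(np.mean(sigmoid(design(x) @ beta))),
        "max_probability": float(p_grid.max()),
        "area_fraction": float(np.mean(p_grid > 0.5)),  # cells predicted "present"
    }


for n_pa in (200, 1000, 10000):
    print(n_pa, summarise(n_pa))
```

In a logistic regression with an intercept, the mean fitted probability on the training data equals the share of presences in that data, so adding pseudo-absences lowers the predicted probabilities as a whole. Under a fixed threshold this shrinks the area predicted as occupied, even where the ranking of grid cells changes little.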