Predicting potential distributions of geographic events using one-class data: concepts and methods
ecological niche modelling, geographic one-class data, maximum entropy, one-class support vector machine, positive and unlabelled learning
One common problem with geographic data is that, for a specific geographic event, only occurrence information is available; information about the absence of the event is not. We refer to this type of geospatial data as geographic one-class data (GOCD). Predicting from GOCD the potential spatial distribution of a particular geographic event, that is, where the event may occur, is difficult because traditional binary classification methods, which require both positive and negative training samples, cannot be used. The objective of this research is to define GOCD and to propose novel approaches for modelling the potential spatial distributions of geographic events using GOCD. We investigate the effectiveness of the one-class support vector machine (OCSVM), maximum entropy (MAXENT) and the newly proposed positive and unlabelled learning (PUL) algorithm for solving GOCD problems using a case study: species distribution modelling from synthetic data. Our experimental results indicate that OCSVM, MAXENT and PUL are all generally effective in modelling GOCD. Each method has advantages and disadvantages, but PUL seems to be the most promising.
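To make the one-class setting concrete, the following minimal Python sketch simulates presence-only data along a single environmental gradient and recovers a suitability estimate from labelled and unlabelled sites alone. It is not the PUL algorithm evaluated in this paper. It uses a generic label-frequency correction from the positive and unlabelled learning literature (often attributed to Elkan and Noto), which assumes that recorded occurrences are a random subset of all true presences. The synthetic landscape, the niche shape, the quadratic feature map and all names (`features`, `score`, `suitability`, `c_hat`, `TRUE_LABEL_FREQ`) are illustrative assumptions. For brevity the label frequency is estimated on the training sites rather than on a held-out set.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic data are reproducible


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(x):
    """Quadratic features of an environmental variable x in [0, 1]."""
    t = 2.0 * x - 1.0  # rescale to [-1, 1]
    return [1.0, t, t * t]


def score(w, x):
    """Estimated P(labelled | x) under the logistic model."""
    return sigmoid(sum(wj * fj for wj, fj in zip(w, features(x))))


# Synthetic landscape (every value here is an illustrative assumption).
TRUE_LABEL_FREQ = 0.5  # share of true presences that end up recorded
sites = [random.random() for _ in range(300)]
present = [random.random() < math.exp(-((x - 0.5) / 0.15) ** 2) for x in sites]
# s = 1: recorded occurrence; s = 0: unlabelled (may be presence or absence).
labelled = [1 if p and random.random() < TRUE_LABEL_FREQ else 0 for p in present]

# Step 1: fit a labelled-vs-unlabelled logistic regression by gradient descent.
X = [features(x) for x in sites]
w = [0.0, 0.0, 0.0]
for _ in range(800):
    grad = [0.0, 0.0, 0.0]
    for f, s in zip(X, labelled):
        err = sigmoid(sum(wj * fj for wj, fj in zip(w, f))) - s
        for j in range(3):
            grad[j] += err * f[j]
    w = [wj - 1.0 * gj / len(X) for wj, gj in zip(w, grad)]

# Step 2: estimate the label frequency c as the mean score over labelled sites.
pos_scores = [score(w, x) for x, s in zip(sites, labelled) if s == 1]
c_hat = sum(pos_scores) / len(pos_scores)


def suitability(x):
    """Calibrated estimate of P(present | x), clipped to [0, 1]."""
    return min(1.0, score(w, x) / c_hat)
```

Because the simulated niche is centred at 0.5, `suitability(0.5)` should come out higher than `suitability(0.05)` or `suitability(0.95)`, even though no absence records were used for training.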