Chrome Extension
WeChat Mini Program
Use on ChatGLM

Sensitivity analysis to configuration option settings in a selection of species distribution modelling algorithms

Syme, G., Hatton MacDonald, D., Fulton, B. and Piantadosi, J. (eds) MODSIM2017, 22nd International Congress on Modelling and Simulation.(2017)

Cited 0|Views3
No score
Abstract
In pursuit of a more robust provenance in the field of species distribution modelling, an extensive literature search was undertaken to find the typical default values, and the range of values, for configuration settings of a number of the most commonly used statistical algorithms available for constructing species distribution models (SDM), as implemented in the R script packages (such as Dismo and Biomod2) or other species distribution modelling programs like Maxent. We found that documentation of SDM algorithm configuration option settings in the SDM literature is very uncommon, and the justifications for these settings were minimal, when present. Such settings were often the R default values, or were the result of trial and error. This is potentially concerning for a number of reasons; it detracts from the robustness of the provenance for such SDM studies; a lack of documentation of configuration option settings in a paper prevents the replication of an experiment, which contravenes one of the main tenets of the scientific method. Inappropriate or uninformed configuration option settings are particularly concerning if they represent a poorly understood ecological variable or process, and if the algorithm is sensitive to such settings; this could result in erroneous and/or unrealistic SDMs. We test the sensitivity of two commonly used SDM algorithms to variation in configuration options settings: Random Forests and Boosted Regression Trees. A process of expert elicitation was used to derive a range of appropriate values with which to test the sensitivity of our algorithms. We chose to use species occurrence records for the Koala (Phascolartos cinereus) for our sensitivity tests, since the species has a well known distribution. Results were assessed by comparing the geospatial distribution from each sensitivity test (i.e. altered-settings) SDM for differences compared to the control SDM (i.e. default settings), using geographical information systems (QGIS). In addition, two performance measures were used to compare differences among the altered-setting SDMs to the control. The aim of our study was to be able to draw conclusions as to how reliable reported SDM results may be in light of the sensitivity of their algorithms to certain settings, given the often arbitrary nature of such settings, and the lack of awareness of, and/or attendance to this issue in most of the published SDM literature. Our results indicate that all two algorithms tested showed sensitivity to alternate values for some of their settings. Therefore this study has showed that the choice of configuration option settings in Random Forests and Boosted Regression Trees has an impact on the results, and that assigning suitable values for these settings is a relevant consideration and as such should be always published along with the model.
More
Translated text
Key words
species distribution modelling algorithms,configuration option settings,sensitivity analysis,selection
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined