Practical guide for retaining correlated climate variables and unthinned samples in species distribution modeling, using random forests

Ecological Informatics (2024)

Abstract
Species distribution models contain bias, or inaccurate predictions, arising from both predictor variables and species occurrences. Collinearity among predictor variables is a concern limited to the standard errors of estimates in data models, although questions remain about identification of important variables and model transferability. Thinning of species occurrences, meanwhile, is unlikely to improve samples without information about species characteristics. Here, I present a case study of selection of correlated climate variables and thinning of species samples for distributions of 80 tree species in North America, modeled with random forests. Random forests showed robust identification of important variables and temporal transferability with correlated climate variables, as demonstrated by 1) the greatest accuracy with all 14 input predictor variables, at 0.99 sensitivity and 0.97 specificity, whether from current or future climate, compared to variable selection approaches that reduce correlation; 2) greater accuracy for two-variable models built from the foremost-ranked important variables than from the least important variables; 3) agreement on the four most important variables among models with more than two fixed input variables; 4) identification of the same five most important variables by models with all 14 input variables from current and future climates; and 5) similar mapped models of current species observations with use of current or future climates. Unconstrained distributions were apparent in mapped models with only one preselected temperature and precipitation variable. Thinning samples produced no clear benefits over unthinned samples. Following standard guidelines to remove correlated variables before modeling will result in information loss and may lead to incorrect omission of relevant variables, at least with random forests. Predictive modeling is a practical approach for informing variable selection, including retention of correlated climate variables, which create a reliable basis for accurate predictions.
Keywords
Collinearity, Correlation, Feature selection, Misconceptions, Thinning, Random forests
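
The abstract reports model accuracy as sensitivity and specificity. As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from presence (1) and absence (0) labels. It assumes the standard definitions (sensitivity as the fraction of observed presences predicted as present, specificity as the fraction of observed absences predicted as absent); the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(observed, predicted):
    """Return (sensitivity, specificity) for binary labels.

    Assumes 1 = species present, 0 = species absent (an illustrative
    convention, not necessarily the coding used in the paper).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true presences
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # missed presences
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # true absences
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false presences

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one observed presence and one absence")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy data: 4 observed presences, 6 observed absences.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(observed, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both values, rather than a single overall accuracy, separates errors on presences from errors on absences, which matters when the two classes are unbalanced.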