Random-Splitting Random Forest with Multiple Mixed-Data Covariates

Mohammad Fayaz, Alireza Abadi, Soheila Khodakarim

Journal of Biostatistics and Epidemiology (2023)

Abstract
Background: Bagging (BG) and random forest (RF) are well-known supervised statistical learning methods based on classification and regression trees. Both can handle different types of responses, such as categorical and continuous ones. Many statistical applications involve curves, time series, functional data, or other observations that are related to each other through their domain. RF methods have been extended in the literature to several cases in which functional data serve as covariates or responses. Among these extensions, random splitting summarizes functional data into multiple related summary statistics, such as the average.

Methods: This article extends that method and introduces the mixed-data BG (MD-BG) and RF (MD-RF) algorithms for multiple functional and non-functional (mixed, or hybrid) covariates, and it computes the variable importance plot (VIP) for each covariate.

Results: The main difference between MD-BG and MD-RF lies in how covariates are chosen: in MD-BG all covariates remain in the model, whereas MD-RF uses a random sample of covariates. MD-RF helps to unmask the most important parts of the functional covariates and the most important non-functional covariates.

Conclusions: We apply our methods to two datasets, DTI and Tecator, and compare their performance for continuous and categorical responses using the R package we developed (“RSRF”), available on GitHub.
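The random-splitting step described in the abstract can be illustrated with a minimal Python sketch. It is not the RSRF package or its API; the function names (`random_split_points`, `summarize_curve`, `build_features`) and the toy data are hypothetical. It assumes each functional covariate is observed on a common grid, uses the interval average as the summary statistic, and appends the non-functional covariates unchanged. In a bagging or random forest ensemble, one would presumably redraw the cut points for each tree.

```python
import random
from statistics import mean


def random_split_points(n_points, n_intervals, rng):
    """Draw sorted cut points that split grid indices 0..n_points-1
    into n_intervals non-empty, contiguous pieces."""
    cuts = sorted(rng.sample(range(1, n_points), n_intervals - 1))
    return [0] + cuts + [n_points]


def summarize_curve(curve, bounds):
    """Replace a curve by its average over each random interval."""
    return [mean(curve[a:b]) for a, b in zip(bounds[:-1], bounds[1:])]


def build_features(curves, scalars, n_intervals, seed=0):
    """Build a mixed-data feature table: interval averages of the
    functional covariate followed by the non-functional covariates."""
    rng = random.Random(seed)
    bounds = random_split_points(len(curves[0]), n_intervals, rng)
    rows = [summarize_curve(c, bounds) + list(s)
            for c, s in zip(curves, scalars)]
    return rows, bounds


# Toy data (hypothetical): two curves on a 6-point grid, one scalar each.
curves = [[1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]]
scalars = [[0.5], [1.5]]
rows, bounds = build_features(curves, scalars, n_intervals=3, seed=1)
```

Each row of `rows` could then be passed to any tree-based learner; the interval that a selected summary feature came from indicates which part of the curve the tree found useful.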
Key words
Bagging, Functional data, Random forest, Random splitting, Statistical learning