SPXYE: an improved method for partitioning training and validation sets

Cluster Computing(2018)

Abstract
This study proposes a sample selection strategy termed SPXYE (sample set partitioning based on joint X–Y–E distances) for data partitioning in multivariate modeling, where training and validation sets are required. The method chooses the training set according to the X (independent variables), Y (dependent variables), and E (error of the preliminarily calculated results with respect to the dependent variables) spaces. This selection strategy provides a valuable tool for multivariate calibration. SPXYE was applied to three household chemical molecular databases to obtain training and validation sets for partial least squares (PLS) modeling. For comparison, training and validation sets were also generated using random sampling, Kennard–Stone, and sample set partitioning based on joint X–Y distances (SPXY). All of the associated PLS regression models were evaluated on the same test set, which was distinct from both the training set and the validation set. The results indicated that the proposed SPXYE strategy might serve as an alternative partition strategy.
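The abstract does not give the distance formula, so the following is only a minimal sketch of how a joint X–Y–E selection could work, not the authors' implementation. It assumes that the per-space Euclidean distances are each divided by their maximum and summed (as is commonly done for joint X–Y partitioning), and that samples are then picked by a Kennard–Stone style max-min rule. The function names, the equal weighting of the three spaces, and the toy data are hypothetical; in practice E would hold the residuals of a preliminary model.

```python
from math import sqrt


def pairwise(rows):
    """Euclidean distance matrix for a list of equal-length vectors."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            d[i][j] = d[j][i] = dist
    return d


def normalised(d):
    """Scale a distance matrix by its largest entry (assumed normalisation)."""
    m = max(max(row) for row in d)
    if m == 0:
        return d
    return [[v / m for v in row] for row in d]


def joint_distance_split(X, Y, E, n_train):
    """Select n_train sample indices by max-min selection on summed X, Y, E distances.

    Assumption: the three spaces are weighted equally after normalisation.
    Returns (training indices in selection order, remaining indices).
    """
    n = len(X)
    if not (2 <= n_train <= n):
        raise ValueError("n_train must be between 2 and the number of samples")
    spaces = [normalised(pairwise(s)) for s in (X, Y, E)]
    d = [[sum(s[i][j] for s in spaces) for j in range(n)] for i in range(n)]

    # Seed with the two samples that are farthest apart in the joint distance.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda p: d[p[0]][p[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in selected]

    # Repeatedly add the sample whose nearest selected neighbour is farthest away.
    while len(selected) < n_train:
        nxt = max(remaining, key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy data (hypothetical): one X variable, one Y variable, one residual per sample.
X = [[0.0], [1.0], [2.0], [10.0]]
Y = [[0.0], [1.0], [2.0], [10.0]]
E = [[0.0], [0.1], [0.0], [0.2]]
train_idx, rest_idx = joint_distance_split(X, Y, E, n_train=3)
```

The samples left in `rest_idx` would form the validation set; a separate test set, as described in the abstract, would be held out before this step.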
Key words
Chemical databases, Kennard–Stone, Partial least squares, Sample set partitioning based on joint X–Y distances, Set partition