Pseudo support vector domain description to train large-size and continuously growing datasets

KNOWLEDGE AND INFORMATION SYSTEMS(2021)

引用 0|浏览5
暂无评分
摘要
Support vector domain description (SVDD) is a data description method inspired by support vector machine (SVM). This classifier describes a set of data points with a sphere that encloses the majority of them and has a minimal volume. The boundary of this sphere is used to classify new samples. SVDD has been successfully applied to many challenging classification problems and has shown a good generalization capability. However, this classifier still has some major weaknesses. This paper focuses on two of them: The first regards the large amount of memory and computational time required by SVDD in the training step. This problem manifests most strongly when dealing with large-size datasets and can hinder or prevent its use. This paper presents an approximate solution to this problem that permits to apply SVDD to large-scale datasets. This new version is based on divide-and-conquer strategy and it processes in two steps: It begins by dividing the whole large-size dataset into random subsets that each can be described efficiently with a small sphere using SVDD. Then, it applies our new algorithm that can find the smallest sphere that encloses the minimal spheres built in the previous step. The second weak point of standard SVDD concerns its static learning process. This classifier must be re-trained with the whole dataset each time when new training samples are available. This paper proposes a new dynamic approach that only trains the new samples with SVDD and incorporates the resulting minimal sphere with the previous one (s) to construct the smallest sphere that encloses all the samples. Like Support Vector Domain Description, the proposed approach can be extended to non-linear classification cases by using kernel functions. Experimental results on artificial and real datasets have successfully validated the performance of our approach.
更多
查看译文
关键词
Support vector domain description,Large-size dataset,Dynamic learning,Enclosing the minimal spheres by a minimal sphere,Feature space
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要