Subset selection of training data for machine learning: a situational awareness system case study

Proceedings of SPIE (2015)

Abstract
Recent advances in machine learning with big data sets have allowed for significant advances in the optimisation of classification and recognition systems. However, for applications such as situational awareness systems, the entirety of the available data dwarfs the amount permissible for a training set with tractable machine learning optimization times. Furthermore, the performance of any optimized system is highly dependent on the training set correctly and completely representing the entire data space of scenarios. In this paper we present a technique to characterize the entire data space to ascertain the key factors for representation and subsequently select a subset that statistically represents the correct mix of scenarios. We demonstrate the effectiveness of these characterization and subset selection techniques by using a genetic algorithm to optimize the performance of a gunfire recognition system.
Keywords
Machine learning, big data, optimisation
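
The abstract does not spell out how the data space is characterized or how the subset is drawn, so the following is only a minimal sketch of one generic way to select a subset that preserves the mix of scenarios: proportional stratified sampling. It is not the authors' method. The function name `select_representative_subset`, the `scenario` field, and the example counts are all hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter, defaultdict


def select_representative_subset(records, key, subset_size, seed=0):
    """Draw roughly subset_size records whose scenario mix mirrors the full set.

    Illustrative proportional stratified sampling; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group records by scenario label (the "strata").
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    total = len(records)
    subset = []
    for members in strata.values():
        # Allocate samples in proportion to the stratum's share of the data,
        # keeping at least one sample so rare scenarios are not dropped.
        quota = max(1, round(subset_size * len(members) / total))
        quota = min(quota, len(members))
        subset.extend(rng.sample(members, quota))
    return subset


# Hypothetical data set: 1000 labelled recordings across three scenarios.
records = (
    [{"scenario": "gunfire", "id": i} for i in range(100)]
    + [{"scenario": "traffic", "id": i} for i in range(600)]
    + [{"scenario": "speech", "id": i} for i in range(300)]
)

subset = select_representative_subset(
    records, key=lambda r: r["scenario"], subset_size=50
)
mix = Counter(r["scenario"] for r in subset)
```

Because of rounding and the one-sample minimum per scenario, the returned subset can differ slightly from `subset_size` in general. In this example the 10/60/30 percent split of the full set carries over exactly to the 50-record subset.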