An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning
CoRR (2024)
Abstract
In recent years, Deep Learning has gained popularity for its ability to solve
complex classification tasks, increasingly delivering better results thanks to
the development of more accurate models, the availability of huge volumes of
data and the improved computational capabilities of modern computers. However,
these improvements in performance also bring efficiency problems, related to
the storage of datasets and models, and to the waste of energy and time
involved in both the training and inference processes. In this context, data
reduction can help reduce energy consumption when training a deep learning
model. In this paper, we present eight different methods to reduce the
size of a tabular training dataset, and we develop a Python package to apply
them. We also introduce a representativeness metric based on topology to
measure how similar the reduced datasets are to the full training dataset.
Additionally, we develop a methodology to apply these data reduction methods to
image datasets for object detection tasks. Finally, we experimentally compare
how these data reduction methods affect the representativeness of the reduced
dataset, the energy consumption and the predictive performance of the model.
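
To make the idea of reducing a tabular training dataset concrete, the following is a minimal sketch of stratified random sampling, which keeps roughly the same fraction of rows from every class. This is an illustration only: the abstract does not name the eight methods, so this technique is an assumption rather than one confirmed to be among them, and the function `stratified_reduce` and its parameters are hypothetical and not part of the authors' Python package.

```python
import random
from collections import defaultdict


def stratified_reduce(rows, labels, fraction, seed=0):
    """Keep about `fraction` of the rows of each class (at least one per class).

    Hypothetical helper for illustration; not the API of the paper's package.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Sample without replacement inside each class.
    kept = []
    for indices in by_class.values():
        n_keep = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, n_keep))

    kept.sort()  # preserve the original row order
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy tabular dataset: 100 rows, two features, imbalanced classes (80 vs. 20).
rows = [[float(i), float(i % 7)] for i in range(100)]
labels = [0] * 80 + [1] * 20

reduced_rows, reduced_labels = stratified_reduce(rows, labels, fraction=0.25)
```

Sampling per class keeps the class proportions of the full dataset in the reduced one, which is one simple way to limit the loss of representativeness when training on fewer examples.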