Optimizing Neural Network Training through TensorFlow Profile Analysis in a Shared Memory System

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD), 2019

Abstract
On the one hand, Deep Neural Networks have emerged as a powerful tool for solving complex problems in image and text analysis. On the other, they are sophisticated learning machines that require strong programming and mathematical skills to be understood and implemented. Therefore, most researchers employ toolboxes and frameworks to design and implement such architectures. This paper performs an execution analysis of TensorFlow, one of the most widely used deep learning frameworks, on a shared memory system. To do so, we chose a text classification problem based on tweet sentiment analysis. The focus of this work is to identify the best environment configuration for training neural networks on a shared memory system. We set up five different configurations using environment variables that modify TensorFlow's execution behavior. The results on an Intel Xeon Platinum 8000 series processor show that TensorFlow's default environment configuration achieves a speedup of up to 5.8, but fine-tuning this environment can improve that speedup by at least 37%.
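The abstract does not list the five configurations, but TensorFlow's shared-memory execution is commonly tuned through the Intel MKL/OpenMP environment variables and the framework's own thread-pool settings. The sketch below, written against the TF 1.x API that was current when the paper appeared, illustrates the kind of knobs involved; all concrete values are hypothetical and are not the paper's measured configurations.

    import os

    # Hypothetical values; the paper's five measured configurations are
    # not reproduced in this abstract.
    os.environ["OMP_NUM_THREADS"] = "24"   # OpenMP threads per MKL-backed op
    os.environ["KMP_BLOCKTIME"] = "0"      # ms a thread spins before sleeping
    os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads to cores

    import tensorflow as tf  # imported after setting the env vars so MKL reads them

    # TensorFlow-level thread pools (TF 1.x API, contemporary with the paper)
    config = tf.ConfigProto(
        intra_op_parallelism_threads=24,  # threads inside a single op, e.g. a matmul
        inter_op_parallelism_threads=2,   # independent ops allowed to run concurrently
    )
    session = tf.Session(config=config)

Because the MKL runtime reads its environment variables at load time, they must be set before TensorFlow is imported; this ordering is itself one of the pitfalls such a profiling study would exercise.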
Keywords
tensorflow profile analysis, neural network training, neural network, memory, optimizing