On The Efficiency of Sparse-Tiled Tensor Graph Processing For Low Memory Usage
2021 58th ACM/IEEE Design Automation Conference (DAC), 2021
Abstract
The memory required to host and process large tensor graphs is a limiting factor for embedded ConvNets. Even though many data-driven compression pipelines have proven their efficacy, this work shows there is still room for optimization at the intersection with compute-oriented optimizations. We demonstrate that tensor pruning via weight sparsification can cooperate with a model-agnostic tiling strategy, leading ConvNets towards a new feasible region of the solution space. The collected results show, for the first time, fast versions of MobileNets deployed at full scale on an ARM M7 core with 512KB of RAM and 2MB of FLASH memory.
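The abstract names two ingredients, weight sparsification and a tiling strategy that bounds the live activation footprint. The sketch below is not the paper's pipeline; it is a minimal NumPy illustration, under assumed settings, of how magnitude-based pruning and stripe-wise tiling of a convolution can coexist: the tiled variant only keeps one input stripe (plus halo) resident at a time, yet reproduces the full-tensor result. Function names, the 50% sparsity target, and the tile size are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until the target sparsity is reached.
    (Illustrative; the paper's pruning pipeline may differ.)"""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def conv2d(x, w):
    """Naive single-channel 'valid' 2-D convolution, used as the reference."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def tiled_conv2d(x, w, tile_rows):
    """Process the input in horizontal stripes so only one stripe of the
    activation tensor (plus a kh-1 row halo) must be resident in RAM at a time."""
    kh, _ = w.shape
    oh = x.shape[0] - kh + 1
    rows = []
    for r0 in range(0, oh, tile_rows):
        r1 = min(r0 + tile_rows, oh)
        # Each stripe needs kh - 1 extra input rows (the halo) to compute its outputs.
        rows.append(conv2d(x[r0:r1 + kh - 1, :], w))
    return np.vstack(rows)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 32))
    w = rng.standard_normal((3, 3))
    w_sparse = magnitude_prune(w, sparsity=0.5)   # assumed 50% sparsity target
    full = conv2d(x, w_sparse)
    tiled = tiled_conv2d(x, w_sparse, tile_rows=8)
    print("tiled output matches full output:", np.allclose(full, tiled))
```

Because tiling is exact (the stripes overlap only by the kernel halo), it can be applied after any pruning scheme without retraining, which is the kind of model-agnostic cooperation the abstract alludes to; peak activation memory scales with the stripe height rather than the full feature-map height.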
Keywords
sparse-tiled tensor graph processing, low memory, memory space, tensor graphs, limiting factor, embedded ConvNets, data-driven compression pipelines, compute-oriented optimizations, tensor pruning, weight sparsification, model-agnostic tiling strategy, solution space, FLASH memory, memory size 2.0 MByte, memory size 512.0 KByte