Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
arXiv (2023)
Abstract
On-device machine learning (ML) promises to improve the privacy,
responsiveness, and proliferation of new, intelligent user experiences by
moving ML computation onto everyday personal devices. However, today's large ML
models must be drastically compressed to run efficiently on-device, a hurdle
that requires deep, yet currently niche expertise. To engage the broader
human-centered ML community in on-device ML experiences, we present the results
from an interview study with 30 experts at Apple that specialize in producing
efficient models. We compile tacit knowledge that experts have developed
through practical experience with model compression across different hardware
platforms. Our findings offer pragmatic considerations missing from prior work,
covering the design process, trade-offs, and technical strategies that go into
creating efficient models. Finally, we distill design recommendations for
tooling to help ease the difficulty of this work and bring on-device ML into
more widespread practice.