Expanding Datacenter Capacity with DVFS Boosting: A safe and scalable deployment experience.

International Conference on Architectural Support for Programming Languages and Operating Systems (2024)

Abstract
The COVID-19 pandemic created unexpected demand for our physical infrastructure. We increased our computing supply by growing our infrastructure footprint and by expanding existing capacity with various techniques, among them DVFS boosting. This paper describes our experience in deploying DVFS boosting to expand capacity. Deploying DVFS boosting at scale poses several challenges. First, raising CPU frequency incurs additional power demand, which can exacerbate power over-subscription and cause unexpected capacity loss for services due to power capping. Second, heterogeneity is commonplace in any large-scale infrastructure, so we must account for both service and hardware heterogeneity to determine the optimal setting for each service and hardware type. Third, there is a long tail of services with scarce resources and little support for performance evaluation. Finally, and most importantly, we need to ensure that large-scale changes to CPU frequency do not put the reliability of the services and the infrastructure at risk. We present our solution, which has overcome these challenges and has been running in production for over 3 years. It created 12 MW of supply, which is equivalent to building and populating half a datacenter in our fleet. In addition to the real-world performance of our solution, we share our key takeaways for safely improving fleetwide efficiency via DVFS boosting.