Building a Scalable Infrastructure: To grow computational research and enhance collaboration across the research enterprise

PEARC '20: Practice and Experience in Advanced Research Computing, Portland, OR, USA, July 2020

Abstract
This paper documents our experience in building a scalable cyberinfrastructure to grow computational research and enhance collaboration across the research enterprise. It describes a modernized research computing system built on the principles of Software Defined Infrastructure (SDI) and DevOps. This approach helps us develop, test, and deploy enhancements to our High Performance Computing (HPC) platform. By separating development from production, the approach lets developers new to distributed systems and HPC platforms train safely, gaining advanced skills that are in demand but uncommon in today’s job market. A deployment pipeline empowers teams with varying skill levels to contribute enhancements to HPC platforms, extending the capacity of operations professionals. The paper highlights our experience in setting up Open OnDemand as a web frontend for HPC resources. The Open OnDemand web interface helps new users transition from their dedicated workstations to the HPC ecosystem. Most recently, this model has been extended to form a DataOps team focused on analyzing operational cluster data sets. Their analysis uses XDMoD and Jupyter notebooks, available on the cluster through Open OnDemand, to review wait times and utilization, providing direct feedback on scheduling policy. This review helps validate user experiences and ensures we define our infrastructure in a way that is most useful for our researchers. This knowledge has helped optimize workflows for research teams on campus. Open-source SDI and DevOps principles enable broader collaborations with research teams on and off campus, delivering rapid improvements across teams that can share processes. This discussion further highlights the importance to collaboration of open-source tooling and of sites like GitHub.com and the self-hosted community edition of GitLab.
Key words
scalable infrastructure, building