Evolution Of Monitoring, Accounting And Alerting Services At Infn-Cnaf Tier-1

Stefano Dal Pra,Antonio Falabella,Enrico Fattibene, Gianluca Cincinelli,Matteo Magnani, Tiziano De Cristofaro, Martin Ruini

23RD INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP 2018)(2019)

引用 1|浏览1
暂无评分
摘要
CNAF is the national center of INFN (Italian National Institute for Nuclear Physics) for IT technology services. The Tier-1 data center operated at CNAF offers computing and storage resources to scientific communities as those working on the four experiments of LHC (Large Hadron Collider) at CERN and other 30 experiments in which INFN is involved. In past years, monitoring and alerting services for Tier-1 resources were performed with several software, such as LEMON (developed at CERN and customized on the characteristics of datacenters managing scientific data), Nagios (especially used for alerting purposes) and a system based on Graphite database and other ad-hoc developed services and web pages. By 2015, a task force has been organized with the purpose of defining and deploying a common infrastructure (based on Sensu, InfluxDB and Grafana) to be exploited by the different CNAF departments. Once the new infrastructure was deployed, a major task was then to adapt the whole monitoring and alerting services. We are going to present the steps that the Tier-1 group followed in order to accomplish a full migration, that is now completed with all the new services in production. In particular we will show the monitoring sensors and alerting checks redesign to adapt them to the infrastructure base on the Sensu software, the web dashboards creation for data presentation, the porting of historical data from LEMON/Graphite to InfluxDB.
更多
查看译文
关键词
monitoring,accounting,services,infn-cnaf
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要