A High Performance Hierarchical Storage Management System For The Canadian Tier-1 Centre At TRIUMF

D C Deatrich, S X Liu, R Tafirout

17TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP09) (2010)

Abstract
In this paper we describe the design and implementation of Tapeguy, a high performance, non-proprietary Hierarchical Storage Management (HSM) system that is interfaced to dCache for efficient tertiary storage operations. The system has been successfully implemented at the Canadian Tier-1 Centre at TRIUMF. The ATLAS experiment will collect a large amount of data (approximately 3.5 Petabytes each year). An efficient HSM system will play a crucial role in the success of the ATLAS Computing Model, which is driven by intensive large-scale data analysis activities that will be performed continuously on the Worldwide LHC Computing Grid infrastructure. Tapeguy is Perl-based. It controls and manages data and tape libraries. Its architecture is scalable and includes Dataset Writing control, a Read-back Queuing mechanism, and I/O tape drive load balancing, as well as on-demand allocation of resources. A central MySQL database records metadata for every file and transaction (for audit and performance evaluation), as well as an inventory of library elements. Tapeguy Dataset Writing was implemented to group files that are close in time and of similar type. Optional dataset path control dynamically allocates tape families and assigns tapes to them. Tape flushing is based on various strategies: time, threshold, or external callback mechanisms. Tapeguy Read-back Queuing reorders all read requests using an elevator algorithm, avoiding unnecessary tape loading and unloading. Implementation of priorities will guarantee file delivery to all clients in a timely manner.
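As an illustration of the read-back reordering idea, the following is a minimal sketch in Python, not Tapeguy's actual Perl implementation. It assumes each read request carries a tape label and a file position on that tape; the names `ReadRequest`, `tape`, `position`, and `reorder_reads` are hypothetical. Requests are grouped per tape so that each tape is mounted once, and within a tape they are sorted by position so the drive reads in a single forward sweep, in the spirit of an elevator algorithm. Priorities are not modelled here.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadRequest(NamedTuple):
    """A pending read request (all fields are illustrative assumptions)."""
    path: str      # file requested by the client
    tape: str      # label of the tape that holds the file
    position: int  # position of the file on that tape


def reorder_reads(requests):
    """Group requests by tape, then sort each group by position on tape.

    Tapes are served in the order in which they first appear in the queue,
    so each tape is mounted only once and read in one forward pass.
    """
    by_tape = defaultdict(list)
    for req in requests:
        by_tape[req.tape].append(req)

    ordered = []
    for tape_requests in by_tape.values():  # dicts keep insertion order
        ordered.extend(sorted(tape_requests, key=lambda r: r.position))
    return ordered


queue = [
    ReadRequest("a", "T1", 30),
    ReadRequest("b", "T2", 5),
    ReadRequest("c", "T1", 10),
    ReadRequest("d", "T2", 1),
]
reordered = reorder_reads(queue)
```

In this sketch the four requests, which would otherwise alternate between two tapes, are served with one mount per tape.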
Keywords
load balance,computer model,data analysis