
DataLad: distributed system for joint management of code, data, and their relationship.

J. Open Source Softw. (2021)

Abstract
DataLad is a Python-based tool for the joint management of code, data, and their relationship, built on top of a versatile system for data logistics (git-annex) and the most popular distributed version control system (Git). It adapts principles of open-source software development and distribution to address the technical challenges of data management, data sharing, and digital provenance collection across the life cycle of digital objects. DataLad aims to make data management as easy as managing code. It streamlines procedures to consume, publish, and update data, for data of any size or type, and to link them as precisely versioned, lightweight dependencies. DataLad helps to make science more reproducible and FAIR (Wilkinson et al., 2016). It can capture complete and actionable process provenance of data transformations to enable automatic re-computation. The DataLad project (datalad.org) delivers a completely open, pioneering platform for flexible decentralized research data management (RDM) (Hanke, Pestilli, et al., 2021). It features a Python and a command-line interface, an extensible architecture, and does not depend on any centralized services but facilitates interoperability with a plurality of existing tools and services. In order to maximize its utility and target
Keywords
Data Sharing, Distributed Storage, Data Reuse, Software Development, Parallel Computing