FAIR Machine Learning Model Pipeline Implementation of COVID-19 Data

DATA INTELLIGENCE(2022)

引用 3|浏览0
暂无评分
摘要
Research and development are gradually becoming data-driven and the implementation of the FAIR Guidelines (that data should be Findable, Accessible, Interoperable, and Reusable) for scientific data administration and stewardship has the potential to remarkably enhance the framework for the reuse of research data. In this way, FAIR is aiding digital transformation. The 'FAIRification' of data increases the interoperability and (re)usability of data, so that new and robust analytical tools, such as machine learning (ML) models, can access the data to deduce meaningful insights, extract actionable information, and identify hidden patterns. This article aims to build a FAIR ML model pipeline using the generic FAIRification workflow to make the whole ML analytics process FAIR. Accordingly, FAIR input data was modelled using a FAIR ML model. The output data from the FAIR ML model was also made FAIR. For this, a hybrid hierarchical k-means (HHK) clustering ML algorithm was applied to group the data into homogeneous subgroups and ascertain the underlying structure of the data using a Nigerian-based FAIR dataset that contains data on economic factors, healthcare facilities, and coronavirus occurrences in all the 36 states of Nigeria. The model showed that research data and the ML pipeline can be FAIRified, shared, and reused by following the proposed FAIRification workflow and implementing technical architecture.
更多
查看译文
关键词
FAIRification,Semantic data model,Cluster analysis,FAIR data,Metadata,Machine learning model
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要