Exploiting Metadata Semantics in Data Lakes Using Blueprints.

ENASE (Selected Papers)(2022)

Abstract
Smart processing of Big Data has recently emerged as a field that poses numerous challenges related to handling multiple heterogeneous data sources that produce massive amounts of structured, semi-structured and unstructured data. One solution to this problem is to manage this fusion of disparate data sources through Data Lakes. Data Lakes, though, suffer from the lack of a disciplined approach to collecting, storing and retrieving data to support predictive and prescriptive analytics. This chapter tackles this challenge by introducing a novel standardization framework for managing data in Data Lakes that mainly combines the 5Vs Big Data characteristics and blueprint ontologies. It organizes a Data Lake using a ponds architecture and describes a metadata semantic enrichment mechanism that enables fast storage and efficient retrieval. The mechanism supports Visual Querying and offers increased security via Blockchain and Non-Fungible Tokens. The proposed approach is compared against other known metadata systems using a set of functional properties, with very encouraging results.
Keywords
metadata semantics, metadata lakes, blueprints