The central role of data repositories and data models in Data Science and Advanced Analytics

Future Generation Computer Systems (2022)

Abstract
In the age of “Data Science and Advanced Analytics”, we are witnessing a race to develop data-driven smart systems in domains such as business, finance, healthcare, environment, and cybersecurity, driven by the explosion of data issued by various providers. This development helps create added value for companies and citizens. Two complementary ingredients are required to ensure valuable systems: data and models. The data dimension is mainly related to Data Science, which unifies machine learning, statistics, data mining, databases, and distributed systems. Achieving this value may require augmenting input data with resources such as Knowledge Graphs. The success of these techniques strongly depends on the quality of the input data and on the consideration of other non-functional properties related to legal, ethical, and economic aspects. Modeling, for its part, plays a crucial role in Data Science since it covers all steps of the Data Science workflow. Regarding data provenance and quality, models contribute to providing vendor-independent solutions. At the algorithmic level, models help explain the inner workings of the methods and algorithms used to system designers, users, regulators, and citizens, in order to achieve trust and accountability. Therefore, the success of Data Science depends on our ability to use it in a smart way while simultaneously exploiting data and modeling capabilities.
Keywords
00-01, 99-00