A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores
arXiv (2023)
Abstract
Modern applications commonly need to manage dataset types composed of
heterogeneous data and schemas, making it difficult to access them in an
integrated way. A single data store to manage heterogeneous data using a common
data model is not effective in such a scenario, which results in the domain
data being fragmented in the data stores that best fit their storage and access
requirements (e.g., NoSQL, relational DBMS, or HDFS). Moreover, organizational
workflows consume these fragments independently, and usually there is no
explicit link among the fragments that could support an integrated
view. The research challenge tackled by this work is to provide the means to
query heterogeneous data residing on distinct data repositories that are not
explicitly connected. We propose a federated database architecture that gives
users a single abstract global conceptual schema against which they write
their queries. The architecture encapsulates data heterogeneity, location, and
linkage by employing: (i) meta-models to represent the global conceptual schema,
the local conceptual schemas of the remote data, and the mappings among them;
and (ii) provenance to create explicit links among the consumed and generated
data residing in separate datasets. We evaluated the architecture through its implementation as
a polystore service, following a microservice architecture approach, in a
scenario that simulates a real case in the Oil & Gas industry. We also compared
the proposed architecture to a relational multidatabase system based on foreign
data wrappers, measuring the user's cognitive load to write a query (or query
complexity) and the query processing time. The results demonstrated that
queries written for the proposed architecture are half as complex as those
written for the relational multidatabase system, adding an excess of no more
than 30
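
To make the role of the mappings and provenance links more concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a hypothetical global schema with two entities (`Well`, `SeismicFile`); the store names, containers, fields, and the helpers `resolve` and `linked` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalAttribute:
    """Where one global attribute physically lives (all values hypothetical)."""
    store: str      # e.g. "relational", "nosql", "hdfs"
    container: str  # table, collection, or index name
    field: str      # column or key name in that container


# Hypothetical mappings: (global entity, global attribute) -> local attribute.
MAPPINGS = {
    ("Well", "name"): LocalAttribute("relational", "wells", "well_name"),
    ("Well", "depth"): LocalAttribute("relational", "wells", "depth_m"),
    ("SeismicFile", "path"): LocalAttribute("hdfs", "seismic_index", "file_path"),
}

# Hypothetical provenance links: (generated entity, consumed entity).
PROVENANCE_LINKS = [("SeismicFile", "Well")]


def resolve(entity, attributes):
    """Group the requested global attributes by the store that holds them."""
    plan = {}
    for attr in attributes:
        local = MAPPINGS[(entity, attr)]
        plan.setdefault(local.store, []).append(f"{local.container}.{local.field}")
    return plan


def linked(a, b):
    """Return True if provenance connects the two entities in either direction."""
    return (a, b) in PROVENANCE_LINKS or (b, a) in PROVENANCE_LINKS


print(resolve("Well", ["name", "depth"]))
print(linked("Well", "SeismicFile"))
```

In this sketch the user names only global entities and attributes; the lookup table hides which store holds each value, and the provenance list stands in for the explicit links that would let fragments in separate stores be joined.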