A Taxonomy of Dataset Search

Abdullah Hassan Almuntashiri, Luis-Daniel Ibáñez, Adriane Chapman

Lecture Notes on Data Engineering and Communications Technologies (2023)

The demand for and use of data have increased in all life science domains, particularly in scientific communities. Data is organised into datasets, which are used in many tasks, e.g. training machine learning (ML) models. These datasets are stored, either privately or publicly, in repositories or data portals that can be published on the Web. The need to find and reuse datasets has given rise to a new research field focused on the process of searching for datasets that meet users’ needs. This paper therefore explores the dataset search literature in order to identify and classify the methods, algorithms, systems and benchmarks in use. We searched the dataset search literature comprehensively across various search engines, scientific sites and digital libraries. We found more than 100 dataset search articles and narrowed them to 31 after applying the exclusion criteria. As a result, we designed a new dataset search taxonomy based on the search style that users employ to retrieve datasets.
dataset search, taxonomy